How to implement Matrix Multiplication using Map-Reduce?

In this use case, we implement matrix multiplication using Map Reduce.

Matrix multiplication using Map Reduce_1.gif


The Map Reduce paradigm is the soul of distributed parallel processing in Big Data.

Before writing the code, let's first create the matrices and put them in HDFS.

  • Create two files, M1 and M2, and put the matrix values in them (separate columns with spaces and rows with line breaks).


Matrix_values_2.JPG


  • Put the above files in HDFS at the location /user/cloudera/matrices/


HDFS_3.JPG
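As a minimal sketch of the file format described above, the snippet below renders a matrix as space-separated columns with one row per line. The 2x2 values and the helper names `format_matrix` and `write_matrix` are hypothetical and are not taken from the original screenshots.

```python
# Hypothetical 2x2 matrices, chosen only for illustration.
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]


def format_matrix(matrix):
    """Return one text line per row, with columns separated by spaces."""
    return "\n".join(" ".join(str(v) for v in row) for row in matrix) + "\n"


def write_matrix(path, matrix):
    """Write the matrix to a local file in the format described above."""
    with open(path, "w") as handle:
        handle.write(format_matrix(matrix))


print(format_matrix(A), end="")
# After write_matrix("M1", A) and write_matrix("M2", B), the local files can
# be copied to HDFS with `hadoop fs -put M1 M2 <target directory>`.
```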


Let's start the code.

We need to create two programs: Mapper and Reducer.


Mapper.py

  • First, define the dimensions of the matrices (m, n).


mapper_py.JPG


Read each line (i.e. a row) from stdin and split it into separate elements. Map int to each element, since elements are read from stdin as strings.

stdin.JPG
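The parsing step can be sketched as follows. The function name `parse_row` is hypothetical; in a streaming script it would be applied to each line of `sys.stdin`.

```python
def parse_row(line):
    """Split a whitespace-separated line and map int over the tokens."""
    return list(map(int, line.split()))


print(parse_row("1 2\n"))
```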


The mapper will first read the first matrix and then the second. To differentiate them, we can keep a count i of the line number we are reading; the first m_r lines belong to the first matrix.

m_r_lines.JPG
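A possible way to express this line counting is shown below. It assumes that all rows of the first matrix arrive on the input before any row of the second; the function name `tag_rows` and the labels "A" and "B" are hypothetical.

```python
def tag_rows(lines, m_r):
    """Yield (matrix_name, row_index, row) for each input line.

    Assumes the first m_r lines are the rows of the first matrix and
    every later line is a row of the second matrix.
    """
    for i, line in enumerate(lines):
        row = list(map(int, line.split()))
        if i < m_r:
            yield "A", i, row
        else:
            # Row index inside the second matrix is the line count minus m_r.
            yield "B", i - m_r, row


for tagged in tag_rows(["1 2", "3 4", "5 6", "7 8"], m_r=2):
    print(tagged)
```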


Now comes the crucial part: printing the key and value. We need to think of a key that will group the elements that need to be multiplied, the elements that need to be summed, and the elements that belong to the same row.

{0} {1} {2} are the parts of the key and {3} is the value.

To understand how I assigned the key, let's refer to the image below.

assign_key.jpg


{0} {1} {2} actually represent the position of an element from A or B in A*B:

  • {0} is the row position of the element
  • {1} is the column position of the element
  • {2} is the position of the element in the addition (for example, 1 and 6 are at position 0 in the addition, and 2 and 5 are at position 1)

We can see that each element of A is repeated as many times as B has columns, i.e. 2, and each element of B is repeated as many times as A has rows, i.e. 2.

In the program:

  • i is used to iterate through each row
  • j is used to iterate through each column
  • k is used to iterate through each duplicate produced

For each element in matrix A:


  • The element remains in the same row, therefore {0}=i
  • The element is duplicated and distributed to each column, so its column position in A*B equals the duplication order of the element, i.e. {1}=k
  • As you can see in the picture, the position of the element in the addition is the same as its column number, therefore {2}=j


For each element in matrix B:


  • The element remains in the same column, therefore {1}=j
  • The element is duplicated and distributed to each row, so its row position in A*B equals the duplication order of the element, i.e. {0}=k
  • As you can see in the picture, the position of the element in the addition is the same as its row position, therefore {2}=i-m_r
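The key rules for A and B can be put together in a short sketch. This is not the code from the screenshot: it is a reconstruction from the rules listed above, with a hypothetical function name `map_lines` and hypothetical input values. It takes an iterable of lines so it can be tried without Hadoop; in a streaming script the lines would come from `sys.stdin` and each result would be printed.

```python
def map_lines(lines, m_r, n_c):
    """Yield "{0} {1} {2} {3}" records: row, column, addition position, value.

    m_r is the number of rows of A and n_c the number of columns of B.
    Assumes the first m_r lines hold A and the remaining lines hold B.
    """
    for i, line in enumerate(lines):
        row = list(map(int, line.split()))
        for j, value in enumerate(row):
            if i < m_r:
                # Element of A: same row i, copied to every column k of A*B,
                # position in the addition is its own column j.
                for k in range(n_c):
                    yield "{0} {1} {2} {3}".format(i, k, j, value)
            else:
                # Element of B: same column j, copied to every row k of A*B,
                # position in the addition is its own row i - m_r.
                for k in range(m_r):
                    yield "{0} {1} {2} {3}".format(k, j, i - m_r, value)


# Hypothetical input: A = [[1, 2], [3, 4]] followed by B = [[5, 6], [7, 8]].
for record in map_lines(["1 2", "3 4", "5 6", "7 8"], m_r=2, n_c=2):
    print(record)
```

Each key appears exactly twice in the output: once for the element of A and once for the element of B that must be multiplied with it.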

Output of Mapper.py

cloudera.JPG


If you look closely, you will realize that elements with the same key (the first 3 numbers are the key) will get multiplied. Elements with the same first two numbers of the key are part of the same sum, and elements with the same first number of the key belong to the same row.

After the mapper produces its output, Hadoop will sort it by key and provide it to reducer.py.
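To see what the sort does to the records, the sketch below uses Python's `sorted()` on the raw text lines as a stand-in for Hadoop's sort; that substitution is an assumption of this example. A plain text sort matches numeric order only while every index has a single digit. The records are hypothetical and cover row 0 of a 2x2 product.

```python
from itertools import groupby

# Hypothetical mapper records for row 0 of A*B.
mapper_output = [
    "0 0 0 1", "0 1 0 1", "0 0 1 2", "0 1 1 2",  # copies of row 0 of A
    "0 0 0 5", "0 1 0 6", "0 0 1 7", "0 1 1 8",  # copies of B sent to row 0
]


def key_of(line):
    """The key is the first three fields of a record."""
    return tuple(line.split()[:3])


def group_by_key(lines):
    """Sort the records as text, then collect the values that share a key."""
    ordered = sorted(lines)
    return [
        (key, [int(record.split()[3]) for record in group])
        for key, group in groupby(ordered, key=key_of)
    ]


for key, values in group_by_key(mapper_output):
    print(" ".join(key), "->", values)
```

After sorting, the two values that must be multiplied sit next to each other, and the pairs that form one sum are adjacent as well.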


Reducer.py

Our reducer program will get the sorted mapper result, which will look like this.

reducer_py.png


If you look closely at the output and the image of matrix multiplication, you will realize:

  • Every 2 numbers need to be multiplied
  • Every m_c multiplied results need to be summed
  • Every n_c summed results belong to the same row
  • There will be m_r rows

After the above observations, the reducer code seems easier.

reducer_Code.JPG
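The four observations translate fairly directly into a positional reducer. The sketch below is a reconstruction, not the code from the screenshot; the function name `reduce_lines` and the sample records are hypothetical, and it assumes the records arrive fully sorted in a single stream.

```python
def reduce_lines(sorted_lines, m_c, n_c):
    """Turn sorted mapper records into the rows of the product matrix.

    m_c is the number of columns of A and n_c the number of columns of B.
    """
    values = [int(line.split()[3]) for line in sorted_lines if line.strip()]
    rows, row, cell_sum = [], [], 0
    for pair_index in range(len(values) // 2):
        # Every 2 consecutive values share a full key and are multiplied.
        cell_sum += values[2 * pair_index] * values[2 * pair_index + 1]
        if (pair_index + 1) % m_c == 0:
            # Every m_c products add up to one cell of the result.
            row.append(cell_sum)
            cell_sum = 0
            if len(row) == n_c:
                # Every n_c cells complete one row.
                rows.append(row)
                row = []
    return rows


# Hypothetical sorted records for A = [[1, 2], [3, 4]], B = [[5, 6], [7, 8]].
sorted_records = [
    "0 0 0 1", "0 0 0 5", "0 0 1 2", "0 0 1 7",
    "0 1 0 1", "0 1 0 6", "0 1 1 2", "0 1 1 8",
    "1 0 0 3", "1 0 0 5", "1 0 1 4", "1 0 1 7",
    "1 1 0 3", "1 1 0 6", "1 1 1 4", "1 1 1 8",
]
for result_row in reduce_lines(sorted_records, m_c=2, n_c=2):
    print(" ".join(str(v) for v in result_row))
# 19 22
# 43 50
```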


Running the Map-Reduce Job on Hadoop

You can run the map reduce job and view the result with the following code (assuming you have already put the input files in HDFS).

HDFS_Code.JPG
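For readers without the screenshot, here is a rough sketch of how a Hadoop Streaming invocation could be assembled. The jar location and both directories are hypothetical placeholders, and the script names simply follow the ones used in this post. Requesting a single reducer is an assumption of this sketch, made because the reducer logic described above relies on seeing all sorted records in one stream.

```python
import shlex

# Hypothetical locations: adjust all three to your own installation.
STREAMING_JAR = "/path/to/hadoop-streaming.jar"
INPUT_DIR = "/path/to/input/matrices"
OUTPUT_DIR = "/path/to/output"


def streaming_command(jar, input_dir, output_dir):
    """Build the argument list for a Hadoop Streaming job."""
    return [
        "hadoop", "jar", jar,
        # Generic option, so it comes before the streaming options:
        # ship both scripts to the cluster nodes.
        "-files", "Mapper.py,Reducer.py",
        "-numReduceTasks", "1",
        "-input", input_dir,
        "-output", output_dir,
        "-mapper", "python Mapper.py",
        "-reducer", "python Reducer.py",
    ]


command = streaming_command(STREAMING_JAR, INPUT_DIR, OUTPUT_DIR)
print(" ".join(shlex.quote(part) for part in command))
# To launch the job from Python: subprocess.run(command, check=True)
```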


This will take some time as Hadoop does its mapping and reducing work. After the successful completion of the above process, view the output by:

HDFS_cloudera.JPG


The above command should output the resultant matrix.

resultant_matrix.png


The above code is not limited to any size. We can multiply matrices of any valid size by changing the input and the dimensions in the code.
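"Valid size" here means that the number of columns of the first matrix equals the number of rows of the second. A small check like the one below could be run before changing the dimensions in the code; the function name `product_shape` is hypothetical.

```python
def product_shape(m, n):
    """Return the shape of A*B for shapes m = (m_r, m_c) and n = (n_r, n_c)."""
    m_r, m_c = m
    n_r, n_c = n
    if m_c != n_r:
        raise ValueError(
            "cannot multiply: A has {0} columns but B has {1} rows".format(m_c, n_r)
        )
    return (m_r, n_c)


print(product_shape((2, 3), (3, 4)))
```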


Original post can be found here.

Siddharth Garg
Software Development Engineer
