[Haifux] Lecture Suggestion - Disco Project, an open source Map-Reduce framework based on Erlang and Python (mostly Python :-) )

Eran Sandler eran at sandler.co.il
Thu Jan 15 09:24:24 MSK 2009


Thanks for the explanation Orna :-)

It would also be interesting to have a lecture on Hadoop, both to show
another Map-Reduce implementation out in the wild and to compare Disco
to Hadoop.
I guess an additional follow-up lecture on Hadoop could work, but I'm less
familiar with it (and all of its components - and it has quite a few) than I
am with Disco.

Eran

On Wed, Jan 14, 2009 at 11:13 PM, Orna Agmon Ben-Yehuda
<ladypine at gmail.com> wrote:

>
> In short, this is the method Google uses for parallel processing. Every
> operation that needs to be parallel is divided into a mapping stage (where
> each worker does something on its own data and produces a result) and a
> reduction stage, where the results of the map are collected into a
> meaningful result. This parallelization scheme is highly scalable.
>
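
A minimal sketch of the two stages described above, assuming a word-count
task. It is plain single-process Python, not the Disco or Hadoop API; the
names map_words, group_by_key, reduce_counts and word_count are hypothetical
and exist only for this illustration. In a real framework each chunk would
be handed to a separate worker.

```python
from collections import defaultdict
from functools import reduce


def map_words(chunk):
    """Map stage: one worker turns its own chunk of text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]


def group_by_key(pairs):
    """Collect every value that the map stage emitted for the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(values):
    """Reduce stage: fold all values for one key into a single result."""
    return reduce(lambda a, b: a + b, values, 0)


def word_count(chunks):
    # Each chunk is mapped independently of the others, which is what
    # lets a framework spread this step across many workers.
    mapped = [pair for chunk in chunks for pair in map_words(chunk)]
    grouped = group_by_key(mapped)
    return {word: reduce_counts(counts) for word, counts in grouped.items()}


if __name__ == "__main__":
    chunks = ["the quick brown fox", "the lazy dog", "The end"]
    print(word_count(chunks))
```

The grouping step between map and reduce is what brings all values for one
key together, so each key can also be reduced independently of the others.
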
> This (upcoming) Sunday's slides on Map-Reduce from the Concurrent and
> Distributed Programming course:
>
> http://webcourse.cs.technion.ac.il/236370/Winter2008-2009/ho/WCFiles/map-reduce-lecture.pdf
> And a canonical paper, linked from the same place:
> http://labs.google.com/papers/mapreduce-osdi04.pdf
>
> Orna.
>
>
> On Wed, Jan 14, 2009 at 10:27 PM, guy keren <choo at actcom.co.il> wrote:
>
>>
>> can you explain, briefly, what map-reduce is, so those not in the know
>> (like me) will be able to decide if this is interesting? ;)
>>
>> thanks,
>> --guy
>>
>> Eran Sandler wrote:
>> > Hi all,
>> >
>> > It's been a while since I've posted to Haifux (or Linux-IL for that
>> > matter) but I am watching the mailing list from time to time and due to
>> > personal reasons found myself as a Haifa citizen for the past year (and
>> > probably for a couple more years :-) ).
>> >
>> > Somewhere in 2004 I even did a lecture on Mono, the open source .NET
>> > implementation, if some of you recall.
>> >
>> > Recently I've been involved with a cool open source project called
>> > Disco.
>> >
>> > Disco is an open source Map-Reduce framework written in Erlang and
>> > Python. It was written at Nokia's Palo Alto research center as a
>> > lightweight framework for rapid scripting of distributed data processing
>> > tasks but grew to become even more than that and is now even used for
>> > probabilistic modeling, data mining, full text indexing, etc.
>> >
>> > You can read more about Disco at http://discoproject.org
>> >
>> > Would a lecture on Map-Reduce in general, and on Disco specifically,
>> > interest people?
>> >
>> > If so, I'm more than willing to give the lecture and show some examples.
>> >
>> > Eran
>> >
>> >
>> > ------------------------------------------------------------------------
>> >
>> > _______________________________________________
>> > Haifux mailing list
>> > Haifux at haifux.org
>> > http://hamakor.org.il/cgi-bin/mailman/listinfo/haifux
>>
>
>