Re: [equinox-dev] [prov] peer-to-peer downloads (ECF)

On Fri, Feb 15, 2008 at 4:08 PM, Jeff McAffer <jeff@xxxxxxxxx> wrote:
> Remy, excuse me if this is really basic and naïve.  I've only used bittorrent
>  to download large single wad files...
>  Would it be reasonable to use bittorrent to download *parts* of things?  For
>  example, if you look at an entire repo as one file then when you want just
>  one of the artifacts ...

It would be better not to think of the repo as a single file, but
rather as an index that points to a set of files. These might
even be meaningful units like bundles or IUs. As for an individual
download: each file is still downloaded as a single entity, but
under the covers the client makes multiple network requests to
multiple hosts to achieve that.

>  Seems that bittorrent clients can look for peers that have parts of
>  files so if there was an index saying which artifacts are in what parts of
>  the torrent file, a client could just look for the parts of interest and lay
>  them out on disk appropriately.

That's an internal optimisation rather than something a user has
control over, somewhat like the way TCP streams are chunked into
IP packets across the net: it happens, but you don't (really) know
about it.

In terms of externally 'indexable' items, bittorrent tracks them at
the 'file' level. Each file is split up into different segments
(think downloading http://a.example.com/file.zip{0-10k} and
http://b.example.com/file.zip{10k-20k} etc.)
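
To make the segment idea concrete, here is a rough sketch of how one file
could be cut into fixed-size byte ranges and spread over several hosts.
The function names, the round-robin assignment and the 10,000-byte piece
size are assumptions for illustration only; a real bittorrent client
chooses peers dynamically and verifies each piece against a hash.

```python
def piece_ranges(file_size, piece_size):
    """Return (start, end) byte ranges, end exclusive, covering the file."""
    return [(start, min(start + piece_size, file_size))
            for start in range(0, file_size, piece_size)]


def assign_pieces(file_size, piece_size, hosts):
    """Hand out pieces to hosts round-robin (illustrative policy only)."""
    return [(hosts[i % len(hosts)], start, end)
            for i, (start, end) in enumerate(piece_ranges(file_size, piece_size))]


# Hypothetical 25,000-byte file.zip served by two hosts.
plan = assign_pieces(25_000, 10_000, ["a.example.com", "b.example.com"])
for host, start, end in plan:
    print(f"{host}: bytes {start}-{end - 1}")
```

The caller still ends up with one file.zip on disk; only the transport
underneath is split.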

I guess the interesting question is: why think of a repo as a
single file? Even for clients that don't support bittorrent, it's
much better to think of it as a collection of files. That's one of the
problems with the SDK zips; a lot of the same data is duplicated
across the individual zip files. If the contents were individually
addressable, you would only have to store them once.
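
As a sketch of what "store them once" could look like: if each repo is
just an index mapping artifact names to content hashes, identical
artifacts shared between repos take up space only once. The ArtifactStore
class, the repo names and the bundle names below are all hypothetical,
and SHA-256 is simply my choice of hash here, not anything the
provisioning work prescribes.

```python
import hashlib


class ArtifactStore:
    """Toy content-addressed store: each distinct blob is kept once."""

    def __init__(self):
        self.blobs = {}   # digest -> artifact bytes
        self.repos = {}   # repo name -> {artifact name: digest}

    def add(self, repo, name, data):
        digest = hashlib.sha256(data).hexdigest()
        # Keep the bytes only if this content has not been seen before.
        self.blobs.setdefault(digest, data)
        # The repo itself is just an index pointing at the content.
        self.repos.setdefault(repo, {})[name] = digest
        return digest


store = ArtifactStore()
store.add("sdk-a", "org.example.core.jar", b"core bundle bytes")
store.add("sdk-b", "org.example.core.jar", b"core bundle bytes")
store.add("sdk-b", "org.example.ui.jar", b"ui bundle bytes")
print(len(store.blobs), "blobs stored for 3 index entries")
```

Both repos list the core bundle, but its bytes are stored (and would need
to be downloaded) only once.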