Sharding/swarms

Question

I see that sharding and swarms are on the IOTA roadmap. Why is sharding necessary? Can it be compared to the way Ethereum plans to shard?

The only information I managed to find was a rather vague paragraph on the IOTA blog.

I don't know if I understood the meaning of sharding and swarms correctly. Where can I read about them in the IOTA roadmap? – blockmined, Jan 24 '18 at 13:44
Here: https://blog.iota.org/iota-development-roadmap-74741f37ed01 – user3223162, Jan 24 '18 at 14:14
IoT devices do not have a lot of processing power or storage, so it makes sense to keep only the portion of the tangle (the shard) that is relevant to the IoT device(s). – AlbertK, Jan 26 '18 at 07:20
I understand the general concepts of sharding; I'm more interested in how this is to be achieved. – user3223162, Jan 26 '18 at 08:34

score 3 · Answer 1 · answered Feb 01 '18 at 19:34

Sharding is the process of dividing up data. For example, if you had a database containing information about all of the people alive today it could be quite large (depending on what information you store about each person).

Sharding lets you divide the dataset up across machines based on some criteria. For example, you could set up X shards and choose which shard a record goes to based on the year the person was born.

When you want to list everyone named "Smith", each of the shards is contacted and told to return a list of the people named "Smith". The client (your computer) then puts the results from all shards together to give you a complete picture (all Smiths from all shards), but showing only the data you asked for (say ID, GivenName, FamilyName, Title) so that it fits into memory.
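To make the pattern above concrete, here is a minimal sketch in Python. It is only an illustration of generic sharding, not anything taken from IOTA: the shard count, the field names, and the function names (`shard_for`, `query_shard`, `find_by_family_name`) are all assumptions, and the "shards" are plain in-memory lists rather than separate machines.

```python
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical value for the "X shards" mentioned above


def shard_for(birth_year):
    """Choose a shard from the birth year (one possible criterion)."""
    return birth_year % NUM_SHARDS


def build_shards(people):
    """Distribute records across shards according to shard_for()."""
    shards = defaultdict(list)
    for person in people:
        shards[shard_for(person["birth_year"])].append(person)
    return shards


def query_shard(records, family_name, fields):
    """Work done by a single shard: filter, then keep only requested fields."""
    return [
        {field: record[field] for field in fields}
        for record in records
        if record["family_name"] == family_name
    ]


def find_by_family_name(shards, family_name,
                        fields=("id", "given_name", "family_name")):
    """Client side: ask every shard, then merge the partial results."""
    results = []
    for records in shards.values():
        results.extend(query_shard(records, family_name, fields))
    return results


# Made-up sample data for illustration only.
people = [
    {"id": 1, "given_name": "Ann", "family_name": "Smith",
     "birth_year": 1980, "address": "1 Main St"},
    {"id": 2, "given_name": "Bob", "family_name": "Jones",
     "birth_year": 1981, "address": "2 Main St"},
    {"id": 3, "given_name": "Cat", "family_name": "Smith",
     "birth_year": 1990, "address": "3 Main St"},
]

shards = build_shards(people)
smiths = find_by_family_name(shards, "Smith")
```

Each shard only scans its own records, and the client never receives fields it did not ask for (here, `address`), which is what keeps the merged result small.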

Peter Morris

Does it relate to processing power in any way? And most importantly, does it have anything to do with splitting transaction processing as described in https://iota.stackexchange.com/questions/1320/full-node-influence-on-scalability?rq=1 – user3223162 Feb 05 '18 at 10:33
It's to do with storage restrictions and/or parallelization. For example, if you want an average transaction amount you would scan your own data and for each record add to an accumulator in the format TotalValue (number), TransactionCount (number) - You can then ask other shards to do the same. Importantly you can then add TotalValue from each response and also TransactionCount from each, and then perform the same averaging algorithm as if they were simply local records. Google MapReduce – Peter Morris Feb 05 '18 at 14:00
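The averaging example in the comment above can be sketched as follows. This is a generic illustration in Python, not IOTA code; the function names and the sample amounts are made up. Each shard returns a `(TotalValue, TransactionCount)` pair, and the client adds the pairs together before dividing.

```python
def partial_sum(amounts):
    """Work done on one shard: return (TotalValue, TransactionCount)."""
    return (sum(amounts), len(amounts))


def combine(partials):
    """Add up TotalValue and TransactionCount across all shard responses."""
    total_value = sum(p[0] for p in partials)
    transaction_count = sum(p[1] for p in partials)
    return total_value, transaction_count


def average(partials):
    """Same averaging step as for local records, applied to the combined totals."""
    total_value, transaction_count = combine(partials)
    return total_value / transaction_count if transaction_count else 0.0


# Hypothetical transaction amounts held by three shards.
shard_amounts = [[10, 20, 30], [40], [5, 15]]

partials = [partial_sum(amounts) for amounts in shard_amounts]
overall = average(partials)
```

Returning the pair rather than a per-shard average matters: averaging the three per-shard averages here would weight the single-transaction shard as heavily as the others and give a different answer than averaging all six transactions directly.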
I'm aware of MapReduce and its benefits, and sharding is clear the way you describe it. However, I would like to know how it relates to actual transaction processing in the network. Taking Ethereum as an example: every node has to process every transaction, which is why it is so slow and needs to shard, so that every miner does not need to process every transaction. Judging by the link provided above, every full node has to validate every transaction eventually. I was wondering if sharding is related to that fact. – user3223162 Feb 06 '18 at 14:31
It looks like a full node will validate everything, and sharded nodes will validate their subset - but will query other shards/full-nodes to ensure there is no double-spend etc (i.e. part of the validation). – Peter Morris Feb 06 '18 at 16:01
That seems logical to me as well, however do you have any reference to back up your claims? – user3223162 Feb 07 '18 at 10:11
@user3223162 No - The pattern I describe is standard sharding practice. The validation part is a guess, either they will do that or they don't need to. – Peter Morris Feb 07 '18 at 14:12
I agree with your general descriptions; however, I would like to have a valid reference concerning IOTA in particular, so I will leave the question open. Thanks for the input! – user3223162 Feb 07 '18 at 15:22
