I am using a string as the key in a mapping.

[Q] When using a mapping, does gas usage increase as the length of the key string increases?

mapping(string => int) map;

contract.array("mykey");                                 //short key length.
contract.array("mykey_mykey_mykey_mykey_mykey_mykey");   //longer key length is used.

function array(string memory key) public {
    map[key] = 10;
}

Thank you for your valuable time and help.

alper
  • Does this even compile? Generally you would hash the value to create a fixed-length key, which will incidentally have the same gas usage as any other key. – Edmund Edgar Oct 07 '17 at 13:03
  • Actually it does compile, and I am able to use it with keys of different lengths. @Edmund Edgar – alper Oct 07 '17 at 13:05
  • OK, it's probably hashing the string for you anyhow, in which case you'll have constant gas. – Edmund Edgar Oct 07 '17 at 13:07
  • So even if I pass a longer string to the function, it won't affect the gas usage, right? @EdmundEdgar – alper Oct 07 '17 at 13:10
  • That would be my guess, except for the cost of hashing the input, which should be fairly trivial compared to storage unless your strings are crazy long. – Edmund Edgar Oct 07 '17 at 13:47
  • I have tried strings of length 2 and 45; the gas used increases by around 3000. I am not sure whether passing a string argument into the function also counts as additional gas usage. @EdmundEdgar – alper Oct 07 '17 at 14:00

1 Answer

Passing longer strings to your function will use more gas for a number of reasons:

  1. The CALLDATA of the transaction, which contains the parameters passed to the functions of your contract, is charged at 16 gas per non-zero byte (G_txdatanonzero in the Yellow Paper) and 4 gas per zero byte (G_txdatazero).

  2. KECCAK256 is used to turn the string into a key for the mapping lookup, and each 32-byte word of data passed to KECCAK256 by the EVM costs an extra 6 gas (G_sha3word). So if your string is more than 32 bytes long, every additional 32 bytes adds 6 gas.

  3. The compiled code uses CALLDATACOPY to copy the string into memory. This costs 3 gas per 32 byte word (G_copy), so again increases with length of string.

  4. Each extra word of memory used by the EVM costs gas - a longer string will cause more memory to be used. The total cost of memory grows quadratically with memory size, so the marginal cost of each extra word depends on how much memory you are already using. See equation (222) in the YP.

That's all I can think of for now, but there may be more.
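The four length-dependent costs listed above can be roughly estimated off-chain. The following is a minimal sketch, not an exact gas model: the function names are made up for illustration, the constants are the fee-schedule values quoted in this answer plus G_memory = 3 from the Yellow Paper, and it assumes the string is the only argument (ABI-encoded as offset, length, padded data) and that the mapping slot is computed by hashing the unpadded key bytes followed by the 32-byte slot number. Fixed costs such as the base transaction fee and the storage write are deliberately left out.

```python
# Rough, length-dependent gas estimate for passing a string key to a contract.
# Constants are fee-schedule values; everything else is an illustrative assumption.
G_TXDATAZERO = 4      # per zero byte of calldata
G_TXDATANONZERO = 16  # per non-zero byte of calldata
G_KECCAK256WORD = 6   # per 32-byte word hashed
G_COPY = 3            # per 32-byte word copied by CALLDATACOPY
G_MEMORY = 3          # linear coefficient of the memory cost


def words(n_bytes):
    """Number of 32-byte words needed to hold n_bytes."""
    return (n_bytes + 31) // 32


def abi_encode_string_arg(s):
    """ABI-encode a single string argument (without the 4-byte selector)."""
    data = s.encode("utf-8")
    padded = data + b"\x00" * (words(len(data)) * 32 - len(data))
    offset = (32).to_bytes(32, "big")        # offset to the dynamic data
    length = len(data).to_bytes(32, "big")   # byte length of the string
    return offset + length + padded


def calldata_gas(payload):
    """Calldata cost: zero and non-zero bytes are priced differently."""
    zeros = payload.count(0)
    return zeros * G_TXDATAZERO + (len(payload) - zeros) * G_TXDATANONZERO


def memory_gas(n_words):
    """Total memory cost for n_words of memory (linear plus quadratic term)."""
    return G_MEMORY * n_words + n_words * n_words // 512


def length_dependent_gas(s):
    """Break down the costs that grow with the key length."""
    key_len = len(s.encode("utf-8"))
    return {
        "calldata": calldata_gas(abi_encode_string_arg(s)),
        # Assumption: hash input is the key bytes plus a 32-byte slot number.
        "keccak_words": G_KECCAK256WORD * words(key_len + 32),
        "copy": G_COPY * words(key_len),
        # Assumption: only the string itself is counted as extra memory.
        "memory": memory_gas(words(key_len)),
    }
```

Calling length_dependent_gas on the short and long keys from the question and comparing the two dictionaries shows which component grows fastest; under these assumptions the calldata term accounts for almost all of the difference.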

Tomiwa
benjaminion
  • Great answer! To understand better: if I pass a uint32 instead of a string of length 45, would it cost less gas? If yes, I will change my implementation and use a uint32 as the key instead of a string of length 45 or longer. What would be your advice? @benjaminion – alper Oct 07 '17 at 19:08
  • uint32 would give you a constant cost (nearly! zero bytes in the calldata cost only 4 gas, rather than 68, but at least the gas cost would be bounded). On the whole, hashing the string before sending it seems a bit cleaner to me, but you lose transparency in that the string is no longer available to see in the transaction history on the blockchain. Really depends on your use case and requirements. I wouldn't say it's a big issue. – benjaminion Oct 07 '17 at 19:20
  • Thank you, sir. Hashing the string and using its hash as the key seems like the best option, because I provide more than one string to the function and I can just merge them. Do you recommend any algorithm for hashing? I was thinking of using base64 decode/encode, but I was not sure, since I won't be able to use a base64-decoded string in uint64 format. @benjaminion – alper Oct 07 '17 at 19:48
  • Not really my area of expertise, but SHA-256 would produce the right output size (32 bytes/256 bits) and is standard. base64 doesn't give you a constant output size, I think. – benjaminion Oct 07 '17 at 20:06
  • There's no fundamental reason it shouldn't work. You will need to prefix the value with 0x to indicate that it's hexadecimal... – benjaminion Oct 08 '17 at 17:35
  • Oh, that was the reason: I was trying it in Python/populus and passing the value as a string. That was the main reason, I guess. @benjaminion – alper Oct 08 '17 at 17:41
  • SHA-256 would produce the right size, but we won't be able to obtain the original string from the SHA-256 hash. With base64 encode/decode, we could recover the original string by decoding. @benjaminion – alper Oct 11 '17 at 17:18
  • I've updated the Yellow Paper link to the correct one [1]. However, page 27 of that document says that G_txdatanonzero is actually 16, not 68. [1] https://ethereum.github.io/yellowpaper/paper.pdf#page=27 – Tomiwa Jan 29 '22 at 14:01
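
The hash-the-key approach discussed in these comments can be sketched on the client side as follows. This is only an illustration under stated assumptions: the helper name key_to_bytes32_hex is hypothetical, SHA-256 is used because it was suggested in the comments, and the contract is assumed to have been changed to take a fixed-size bytes32 key (for example mapping(bytes32 => int)). Each part is length-prefixed before hashing so that merging several strings cannot produce the same input for different splits.

```python
import hashlib


def key_to_bytes32_hex(*parts):
    """Hash one or more strings into a 0x-prefixed 32-byte hex key.

    Hypothetical helper: each part is prefixed with its 4-byte length so
    that ("ab", "c") and ("a", "bc") hash to different keys.
    """
    h = hashlib.sha256()
    for part in parts:
        data = part.encode("utf-8")
        h.update(len(data).to_bytes(4, "big"))
        h.update(data)
    # The 0x prefix marks the value as hexadecimal for the client library.
    return "0x" + h.hexdigest()


short_key = key_to_bytes32_hex("mykey")
long_key = key_to_bytes32_hex("mykey_mykey_mykey_mykey_mykey_mykey")
```

Both keys have the same length regardless of the input, so the length-dependent part of the gas cost is bounded. As noted above, the hash is one-way: the original string cannot be recovered from it, so it has to be stored or logged elsewhere if it is needed later.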