I am interested in exploring the data in ~/.ethereum/nodes, which I believe contains previous connection attempts and information about the nodes. (Is that correct?) But I am stuck trying to pull the data from that database.

Here is my attempt so far:

import leveldb

db = leveldb.LevelDB("./nodes")
# I don't know the format so I will iterate a few keys
keys = list()
for k in db.RangeIter():
    if len(keys) > 10:
        break
    keys.append(k)

print(keys[0][0])

Which gives me a byte array that I don't know what to do with:

>>> bytearray(b'n:\x00\x00\x07\xf2\x91\xff\xcd\xba%\x8f%\xf8b\xfe\x1b3\xda\x10\xfa,\xb7>\x93\x82_X\r5\xdfG\xae\x8b\xd6-\x9d6\rB\x84$\xb8+\x07\x18<\x8d\xed\xca\x93\xa4\x0bt\x84\xa7\x14\xaf\xc8B\x1a\xb3\xb7(K\x00:discover:lastping')

Sorry if this is more of a Python than an Ethereum question, but I suspect knowing the structure of the data would help, and I can't seem to find it anywhere.

UPDATE: I should clarify what I am working with:

keys[0] is an entry in the nodes database and is a tuple:

( bytearray(b'n:\x00\x00\x07\xf2\x91\xff\xcd\xba%\x8f%\xf8b\xfe\x1b3\xda\x10\xfa,\xb7>\x93\x82_X\r5\xdfG\xae\x8b\xd6-\x9d6\rB\x84$\xb8+\x07\x18<\x8d\xed\xca\x93\xa4\x0bt\x84\xa7\x14\xaf\xc8B\x1a\xb3\xb7(K\x00:discover:lastping'), bytearray(b'\x90\xa7\xae\xef\n') )

With the first item being the key. Running rlp.decode on either item returns an error:

import rlp

rlp.decode(bytes(keys[0][0]))
>>> DecodingError: RLP string ends with 83 superfluous bytes

rlp.decode(bytes(keys[0][1]))
>>> DecodingError: RLP string ends with -12 superfluous bytes

I'm very confused: it looks like geth is RLP-encoding this data, so I don't understand why it would fail to decode.

ethereum_alex

1 Answer

The contents of the database are stored as binary blobs, so you'll have to "de-blobbify" them to get anything human-readable.

The layout can be found in database.go:

// Schema layout for the node database
var (
    nodeDBVersionKey = []byte("version") // Version of the database to flush if changes
    nodeDBItemPrefix = []byte("n:")      // Identifier to prefix node entries with

    nodeDBDiscoverRoot      = ":discover"
    nodeDBDiscoverPing      = nodeDBDiscoverRoot + ":lastping"
    nodeDBDiscoverPong      = nodeDBDiscoverRoot + ":lastpong"
    nodeDBDiscoverFindFails = nodeDBDiscoverRoot + ":findfail"
)
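
Going by these constants and the sample key in the question, a node entry's key appears to be the "n:" prefix, then the raw node ID, then a field suffix such as ":discover:lastping". The sample key is 84 bytes long, which fits a 64-byte node ID (2 + 64 + 18). Here is a minimal Python sketch of splitting such a key, assuming that fixed 64-byte ID length. The names split_key, NODE_PREFIX and NODE_ID_LEN are made up for this illustration and are not part of geth or any library:

```python
NODE_PREFIX = b"n:"   # mirrors nodeDBItemPrefix in the schema above
NODE_ID_LEN = 64      # assumption: node IDs are 64 raw bytes

def split_key(key):
    """Split a database key into (node_id, field).

    Keys without the node prefix (for example b"version") come back
    as (None, field), with the whole key used as the field name.
    """
    key = bytes(key)  # accept the bytearray objects py-leveldb returns
    if not key.startswith(NODE_PREFIX):
        return None, key.decode("ascii", errors="replace")
    item = key[len(NODE_PREFIX):]
    node_id = item[:NODE_ID_LEN]
    field = item[NODE_ID_LEN:].decode("ascii", errors="replace")
    return node_id, field

# Synthetic key with a dummy node ID, shaped like the one in the question
sample = b"n:" + bytes(range(64)) + b":discover:lastping"
node_id, field = split_key(sample)
print(node_id.hex(), field)
```

Passing keys[0][0] from the question through split_key should separate the 64-byte ID from the ":discover:lastping" suffix in the same way.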

If you look through that file you'll see the RLP (Recursive Length Prefix) package being used for the encoding and decoding; the Python version of it is pyrlp (https://github.com/ethereum/pyrlp).
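
A note on the 5-byte value from the question: its first byte, 0x90, is the RLP prefix for a 16-byte string, but only 4 bytes follow, which is where the "-12 superfluous bytes" error comes from. So that value is probably not RLP at all. One reading that fits is a signed varint as written by Go's encoding/binary.PutVarint (zigzag-encoded, 7 bits per byte, least significant group first) holding a Unix timestamp. That is an assumption about how geth stores the lastping field, not something confirmed above, and decode_go_varint is a hypothetical helper written for this sketch:

```python
from datetime import datetime, timezone

def decode_go_varint(buf):
    """Decode one signed varint in the format of Go's binary.PutVarint."""
    unsigned = 0
    shift = 0
    for byte in bytes(buf):
        unsigned |= (byte & 0x7F) << shift   # low 7 bits carry data
        if (byte & 0x80) == 0:               # high bit clear: last byte
            break
        shift += 7
    else:
        # Ran out of input without seeing a terminating byte
        raise ValueError("truncated varint")
    # Undo zigzag encoding: even -> non-negative, odd -> negative
    return (unsigned >> 1) ^ -(unsigned & 1)

# Value taken from keys[0][1] in the question
value = decode_go_varint(b"\x90\xa7\xae\xef\n")
print(value)  # 1458948552
print(datetime.fromtimestamp(value, tz=timezone.utc))
```

Read as seconds since the Unix epoch, the result falls in March 2016, which is plausible for a last-ping time in a database inspected in June 2016.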

Fortune
Richard Horrocks
  • Thanks for pointing this out. I'm still having trouble "de-blobbifying" the index: rlp.decode(b'\x90\xa7\xae\xef\n') --> DecodingError: RLP string ends with -12 superfluous bytes. – ethereum_alex Jun 14 '16 at 18:31
  • Hmm, it's possible you'll have to pass a second (de-)serialisation argument to the decode function so it knows what type it's dealing with. (Which is how the Go code seems to do it.) Have a look at "Sedes objects" in the Python documentation: https://github.com/ethereum/pyrlp/blob/develop/docs/tutorial.rst – Richard Horrocks Jun 14 '16 at 22:08
  • This check actually comes before the sedes check: https://github.com/ethereum/pyrlp/blob/develop/rlp/codec.py#L211 – ethereum_alex Jun 14 '16 at 22:33