
I am having trouble reshaping and preparing the data for the LSTM network.

For each song I have an array of extracted MIDI data, where every element is a tuple of
(offset, note, duration in seconds):

[(0.5, 69.0, 0.33), (1.0, 69.0, 0.25), (1.33, 67.0, 0.42), (1.75, 65.0, 0.08), (1.75, 67.0, 0.08), ..., (44.83, 65.0, 0.42), (45.5, 67.0, 1.0), (46.5, 69.0, 0.75), (48.5, 69.0, 0.33), (49.0, 69.0, 0.25)]

Code to extract the MIDI Data:
https://pastebin.com/embed_js/SHMk3e3j?theme=dark

Section of the code that reproduces the error:

import pickle

import numpy as np


def _main():
    with open('ds', 'rb') as filepath:
        ds = pickle.load(filepath)

    print("\nDATA LOADED!\n")
    #+++++++++++++++++++++++++++++++++++++++++

    print("\nCalculating Metrics\n")
    merge = list()

    for idx, song in enumerate(ds):
        for pitch in song:
            merge.append(pitch)
        if idx > 3: break

    print("Checking if all element are the same shape")
    x = list(map(lambda o: len(o), merge))
    print("max : min = {} : {}".format(max(x), min(x)))

    n_vocab = len(set(merge))

    #+++++++++++++++++++++++++++++++++++++++++

    unique_features = sorted(set(unique for unique in merge))

    #+++++++++++++++++++++++++++++++++++++++++

    print("Vocab Size: {}".format(n_vocab))
    print("Unique Data: {}".format(len(unique_features)))

    print("\nMetrics Calculated!\n")

    #   Create Dict for note2int mapping.
    print("\nMapping Data\n")
    noteDict = dict((note, num) for num, note in enumerate(unique_features))
    print("Data Mapped!\n")

    #   Sequenize the data

    print("\nSequenizing Data\n")
    seqLen = 128
    networkInput = list()
    networkOutput = list()

    for i in range(0, len(ds) - seqLen, 1):
        seqIn = merge[i:seqLen + i]
        networkInput.append([noteDict[char] for char in seqIn])

        seqOut = merge[i + seqLen]
        networkOutput.append(noteDict[seqOut])

    x = list(map(lambda o: len(o), networkInput))
    print("{} == {} ?", min(x), max(x))

    # raise Exception("Break Here")

    n_patterns = len(networkInput)
    print("Pattern Size: {}".format(n_patterns))

    print("Before:", networkInput[0])
    #networkInput = np.asarray(networkInput)
    sample_count = n_patterns
    data_points = seqLen
    feature_count = 3
    networkInput = np.reshape(networkInput, (sample_count, data_points, feature_count))

The Error:
2021-07-25 14:17:20.874498: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll

DATA LOADED!


Calculating Metrics

Checking if all element are the same shape
max : min = 3 : 3
Vocab Size: 32651
Unique Data: 32651

Metrics Calculated!


Mapping Data

Data Mapped!


Sequenizing Data

{} == {} ? 128 128
Pattern Size: 928
Before: [22, 41, 44, 54, 55, 56, 62, 149, 190, 208, 256, 263, 286, 295, 312, 458, 502, 506, 514, 515, 516, 529, 549, 647, 690, 707, 737, 738, 742, 758, 798, 809, 825, 865, 890, 930, 953, 998, 1013, 1112, 1135, 1173, 1198, 1237, 1244, 1253, 1256, 1264, 1265, 1266, 1300, 1320, 1369, 1487, 1494, 1512, 1518, 1727, 1746, 1751, 1763, 1764, 1767, 1794, 1854, 1985, 2002, 0, 36, 58, 75, 126, 186, 250, 281, 299, 318, 370, 431, 469, 497, 512, 533, 536, 570, 620, 685, 754, 768, 782, 787, 800, 806, 861, 926, 986, 1001, 1018, 1052, 1105, 1167, 1231, 1246, 1268, 1295, 1348, 1410, 1443, 1469, 1482, 1508, 1517, 1583, 1642, 1699, 1715, 1739, 1768, 1822, 1868, 1896, 1942, 1972, 1996, 2015, 2063, 2139, 2198]
Traceback (most recent call last):
  File ".\main.py", line 152, in <module>
    _main()
  File ".\main.py", line 112, in _main
    networkInput = np.reshape(networkInput, (sample_count, data_points, feature_count))
  File "<__array_function__ internals>", line 5, in reshape
  File "C:\Users\[username]\anaconda3\envs\[project]\lib\site-packages\numpy\core\fromnumeric.py", line 299, in reshape
    return _wrapfunc(a, 'reshape', newshape, order=order)
  File "C:\Users\[username]\anaconda3\envs\[project]\lib\site-packages\numpy\core\fromnumeric.py", line 55, in _wrapfunc
    return _wrapit(obj, method, *args, **kwds)
  File "C:\Users\[username]\anaconda3\envs\[project]\lib\site-packages\numpy\core\fromnumeric.py", line 44, in _wrapit
    result = getattr(asarray(obj), method)(*args, **kwds)
ValueError: cannot reshape array of size 118784 into shape (928,128,3)
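
For reference, the size in the error message, 118784, equals 928 * 128, whereas the requested shape (928, 128, 3) needs 928 * 128 * 3 = 356352 elements. That is consistent with each (offset, note, duration) tuple having been replaced by a single integer through noteDict, so every time step carries one value rather than three. The following is a minimal sketch of that situation with made-up data; the small sizes and the variable names are assumptions for illustration, not the real dataset:

```python
import numpy as np

# Hypothetical stand-in for the real data: 4 windows of 5 integer-encoded events.
n_patterns, seq_len = 4, 5
network_input = [[i + j for j in range(seq_len)] for i in range(n_patterns)]

arr = np.asarray(network_input)
print(arr.shape)  # 2-D: one integer per time step
print(arr.size)   # n_patterns * seq_len elements in total

# With one integer per event, the feature dimension is 1, so this reshape works.
one_feature = arr.reshape(n_patterns, seq_len, 1)
print(one_feature.shape)

# Asking for 3 features needs three times as many elements and raises ValueError.
reshape_error = None
try:
    arr.reshape(n_patterns, seq_len, 3)
except ValueError as exc:
    reshape_error = str(exc)
    print(reshape_error)
```

Under that reading, setting feature_count = 1 would make the reshape in the code above succeed, though whether a single integer index per event is the right input representation is a separate question.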

I've looked at different sources and examples on how I can reshape my data.
Here are a few:
https://stackoverflow.com/questions/60028853/cannot-reshape-array-of-size-into-shape
https://stackoverflow.com/questions/63957457/how-to-reshape-to-make-a-3d-input-for-lstm
https://github.com/keras-team/keras/issues/8568
https://machinelearningmastery.com/reshape-input-data-long-short-term-memory-networks-keras/

Can anyone point out where I am going wrong and suggest a direction to take?
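
One possible direction, if the goal really is three features per time step, is to skip the integer mapping and window the raw (offset, note, duration) tuples directly. The sketch below assumes made-up events and a small window length purely for illustration; the names events, inputs and targets are hypothetical and do not appear in the original code:

```python
import numpy as np

# Made-up events in the same (offset, note, duration) layout as in the question.
events = [(0.5 * i, 60.0 + (i % 12), 0.25) for i in range(10)]
seq_len = 4  # assumption: tiny window for illustration (the question uses 128)

data = np.asarray(events, dtype=np.float32)  # shape (n_events, 3)

# Each input is seq_len consecutive events; its target is the event that follows.
inputs = np.stack([data[i:i + seq_len] for i in range(len(data) - seq_len)])
targets = data[seq_len:]

print(inputs.shape)   # (samples, time steps, features)
print(targets.shape)  # (samples, features)
```

This yields a 3-D array in the (samples, time steps, features) layout without any reshape call. The trade-off is that the targets become continuous values rather than class indices, so the output layer and loss would need to change accordingly; keeping the integer vocabulary and using a feature dimension of 1 avoids that.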
