
Rare hang #58

Open
HyperCodec opened this issue May 15, 2024 · 39 comments

@HyperCodec
Owner

Not sure how this is happening, but in extremely rare circumstances it is possible to hang indefinitely. See the #57 workflow run for more info.

My guess is that there is one very small outlying situation that causes an RwLock to be locked and then used by a child node, but that shouldn't be possible with the well-tested cycle-prevention algorithm. This definitely requires further debugging, but it is so rare and obscure that it is difficult to catch it in the act and record the details of what happened.

@HyperCodec HyperCodec added the bug Something isn't working label May 15, 2024
@HyperCodec HyperCodec self-assigned this May 15, 2024
@Bowarc

Bowarc commented May 31, 2024

image

After a couple of generations, it just stops and I don't know why.

It appears to happen 100% of the time with my current test; I pushed it at https://github.com/Bowarc/doodlai_jump/tree/ea955a6b681fcbaa2a4e3ec6d81f14970d5414b7

(The /ring package is responsible for training (it's the one hanging after a couple of generations), game is a lib for a really simple version of Doodle Jump, and display is for watching the AI play.)

@HyperCodec
Owner Author

After a couple of generations, it just stops and I don't know why. It appears to happen 100% of the time with my current test...

Hmm, so it's probably something with a recursive RwLock. I'll have to look into it further. It's probably some internal function causing a cyclic neuron dependency (like the DFS not working or something).

@HyperCodec
Owner Author

Btw @Bowarc, can you use the serde feature to dump a JSON (or RON) file on the generation that hangs? (Probably the easiest way to do this would be to overwrite the same file with each generation and then stop the program when it hangs.)
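
Something like this would work (untested sketch; it assumes your genome type implements serde::Serialize and you have serde_json available, and the helper name and file name are just placeholders):

// Hypothetical helper: overwrite the same file every generation so that,
// when the sim hangs, the file on disk holds the generation that hung.
fn dump_generation<G: serde::Serialize>(genomes: &[G]) {
    let json = serde_json::to_string_pretty(genomes).expect("failed to serialize genomes");
    std::fs::write("sim.backup.json", json).expect("failed to write backup");
}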

@Bowarc

Bowarc commented May 31, 2024

OK, I'll do that tomorrow.

@Bowarc

Bowarc commented May 31, 2024

Well, I stayed up longer than expected 😅
Here is the DNA of every genome of a sim that froze:

DNA { network: NeuralNetworkTopology { input_layer: [RwLock { data: NeuronTopology { inputs: [], bias: 0.19538373, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9611819, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.5509694, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.31042653, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9654784, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.81183213, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.86611843, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9298546, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.8283311, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.8759112, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.4996699, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.5423544, activation: linear_activation
 }, poisoned: false, .. }], hidden_layers: [], output_layer: [RwLock { data: NeuronTopology { inputs: [], bias: 0.010660529, activation: sigmoid
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [(Input(6), -0.1170296)], bias: 0.39411813, activation: sigmoid
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [(Input(7), -0.8857193), (Input(11), -0.97913766), (Input(2), 0.2923255), (Input(1), 0.26824117), (Input(2), -0.8934064), (Input(10), -0.19709682), (Input(9), -0.92098737), (Input(6), -0.9772694), (Input(7), 0.08727813), (Input(3), -0.61651254), (Input(9), 0.42674088), (Input(7), -0.801528), (Input(1), 0.6078919)], bias: 0.7993354, activation: sigmoid
 }, poisoned: false, .. }], mutation_rate: 0.01, mutation_passes: 3 } }
DNA { network: NeuralNetworkTopology { input_layer: [RwLock { data: NeuronTopology { inputs: [], bias: 0.19538373, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9611819, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.5509694, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.31042653, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9654784, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.81183213, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.86611843, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9298546, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.8283311, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.8759112, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.4996699, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.5423544, activation: linear_activation
 }, poisoned: false, .. }], hidden_layers: [RwLock { data: NeuronTopology { inputs: [(Input(3), -0.5625169)], bias: 0.59482414, activation: sigmoid
 }, poisoned: false, .. }], output_layer: [RwLock { data: NeuronTopology { inputs: [(Hidden(0), 0.9505495)], bias: 0.010660529, activation: sigmoid
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [(Input(6), -0.1170296)], bias: 0.39411813, activation: sigmoid
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [(Input(7), -0.8857193), (Input(11), -0.97913766), (Input(2), 0.2923255), (Input(1), 0.26824117), (Input(2), -0.8934064), (Input(10), -0.19709682), (Input(9), -0.92098737), (Input(6), -0.9772694), (Input(7), 0.08727813), (Input(3), -0.61651254), (Input(9), 0.42674088), (Input(7), -0.801528), (Input(1), 0.6078919)], bias: 0.7993354, activation: sigmoid
 }, poisoned: false, .. }], mutation_rate: 0.01, mutation_passes: 3 } }
DNA { network: NeuralNetworkTopology { input_layer: [RwLock { data: NeuronTopology { inputs: [], bias: 0.19538373, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9611819, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.5509694, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.31042653, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9654784, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.81183213, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.86611843, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9298546, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.8283311, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.8759112, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.4996699, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.5423544, activation: linear_activation
 }, poisoned: false, .. }], hidden_layers: [], output_layer: [RwLock { data: NeuronTopology { inputs: [], bias: 0.010660529, activation: sigmoid
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [(Input(6), -0.1170296)], bias: 0.39411813, activation: sigmoid
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [(Input(7), -0.8857193), (Input(11), -0.97913766), (Input(2), 0.2923255), (Input(1), 0.26824117), (Input(2), -0.8934064), (Input(10), -0.19709682), (Input(9), -0.92098737), (Input(6), -0.9772694), (Input(7), 0.08727813), (Input(3), -0.61651254), (Input(9), 0.42674088), (Input(7), -0.801528), (Input(1), 0.6078919)], bias: 0.7993354, activation: sigmoid
 }, poisoned: false, .. }], mutation_rate: 0.01, mutation_passes: 3 } }
DNA { network: NeuralNetworkTopology { input_layer: [RwLock { data: NeuronTopology { inputs: [], bias: 0.19538373, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9611819, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.5509694, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.31042653, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9654784, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.81183213, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.86611843, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9298546, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.8283311, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.8759112, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.4996699, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.5423544, activation: linear_activation
 }, poisoned: false, .. }], hidden_layers: [], output_layer: [RwLock { data: NeuronTopology { inputs: [], bias: 0.010660529, activation: sigmoid
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [(Input(6), -0.1170296)], bias: 0.39411813, activation: sigmoid
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [(Input(7), -0.8857193), (Input(11), -0.97913766), (Input(2), 0.2923255), (Input(1), 0.26824117), (Input(2), -0.8934064), (Input(10), -0.19709682), (Input(9), -0.92098737), (Input(6), -0.9772694), (Input(7), 0.08727813), (Input(3), -0.61651254), (Input(9), 0.42674088), (Input(7), -0.801528), (Input(1), 0.6078919)], bias: 0.7993354, activation: sigmoid
 }, poisoned: false, .. }], mutation_rate: 0.01, mutation_passes: 3 } }
DNA { network: NeuralNetworkTopology { input_layer: [RwLock { data: NeuronTopology { inputs: [], bias: 0.19538373, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9611819, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.5509694, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.31042653, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9654784, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.81183213, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.86611843, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9298546, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.8283311, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.8759112, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.4996699, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.5423544, activation: linear_activation
 }, poisoned: false, .. }], hidden_layers: [], output_layer: [RwLock { data: NeuronTopology { inputs: [], bias: 0.010660529, activation: sigmoid
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [(Input(6), -0.1170296)], bias: 0.39411813, activation: sigmoid
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [(Input(7), -0.8857193), (Input(11), -0.97913766), (Input(2), 0.2923255), (Input(1), 0.26824117), (Input(2), -0.8934064), (Input(10), -0.19709682), (Input(9), -0.92098737), (Input(6), -0.9772694), (Input(7), 0.08727813), (Input(3), -0.61651254), (Input(9), 0.42674088), (Input(7), -0.801528), (Input(1), 0.6078919)], bias: 0.7993354, activation: sigmoid
 }, poisoned: false, .. }], mutation_rate: 0.01, mutation_passes: 3 } }
DNA { network: NeuralNetworkTopology { input_layer: [RwLock { data: NeuronTopology { inputs: [], bias: 0.19538373, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9611819, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.5509694, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.31042653, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9654784, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.81183213, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.86611843, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9298546, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.8283311, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.8759112, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.4996699, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.5423544, activation: linear_activation
 }, poisoned: false, .. }], hidden_layers: [], output_layer: [RwLock { data: NeuronTopology { inputs: [], bias: 0.010660529, activation: sigmoid
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [(Input(6), -0.1170296)], bias: 0.39411813, activation: sigmoid
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [(Input(7), -0.8857193), (Input(11), -0.97913766), (Input(2), 0.2923255), (Input(1), 0.26824117), (Input(2), -0.8934064), (Input(10), -0.19709682), (Input(9), -0.92098737), (Input(6), -0.9772694), (Input(7), 0.08727813), (Input(3), -0.61651254), (Input(9), 0.42674088), (Input(7), -0.801528), (Input(1), 0.6078919)], bias: 0.7993354, activation: sigmoid
 }, poisoned: false, .. }], mutation_rate: 0.01, mutation_passes: 3 } }
DNA { network: NeuralNetworkTopology { input_layer: [RwLock { data: NeuronTopology { inputs: [], bias: 0.19538373, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9611819, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.5509694, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.31042653, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9654784, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.81183213, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.86611843, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9298546, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.8283311, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.8759112, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.4996699, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.5423544, activation: linear_activation
 }, poisoned: false, .. }], hidden_layers: [], output_layer: [RwLock { data: NeuronTopology { inputs: [], bias: 0.010660529, activation: sigmoid
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [(Input(6), -0.1170296)], bias: 0.39411813, activation: sigmoid
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [(Input(7), -0.8857193), (Input(11), -0.97913766), (Input(2), 0.2923255), (Input(1), 0.26824117), (Input(2), -0.8934064), (Input(10), -0.19709682), (Input(9), -0.92098737), (Input(6), -0.9772694), (Input(7), 0.08727813), (Input(3), -0.61651254), (Input(9), 0.42674088), (Input(7), -0.801528), (Input(1), 0.6078919)], bias: 0.7993354, activation: sigmoid
 }, poisoned: false, .. }], mutation_rate: 0.01, mutation_passes: 3 } }
DNA { network: NeuralNetworkTopology { input_layer: [RwLock { data: NeuronTopology { inputs: [], bias: 0.19538373, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9611819, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.5509694, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.31042653, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9654784, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.81183213, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.86611843, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9298546, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.8283311, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.8759112, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.4996699, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.5423544, activation: linear_activation
 }, poisoned: false, .. }], hidden_layers: [], output_layer: [RwLock { data: NeuronTopology { inputs: [], bias: 0.010660529, activation: sigmoid
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [(Input(6), -0.1170296)], bias: 0.39411813, activation: sigmoid
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [(Input(7), -0.8857193), (Input(11), -0.97913766), (Input(2), 0.2923255), (Input(1), 0.26824117), (Input(2), -0.8934064), (Input(10), -0.19709682), (Input(9), -0.92098737), (Input(6), -0.97373414), (Input(7), 0.08727813), (Input(3), -0.61651254), (Input(9), 0.42674088), (Input(7), -0.801528), (Input(1), 0.6078919)], bias: 0.7993354, activation: sigmoid
 }, poisoned: false, .. }], mutation_rate: 0.01, mutation_passes: 3 } }
DNA { network: NeuralNetworkTopology { input_layer: [RwLock { data: NeuronTopology { inputs: [], bias: 0.19538373, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9611819, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.5509694, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.31042653, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9654784, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.81183213, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.86611843, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9298546, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.8283311, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.8759112, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.4996699, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.5423544, activation: linear_activation
 }, poisoned: false, .. }], hidden_layers: [], output_layer: [RwLock { data: NeuronTopology { inputs: [], bias: 0.010660529, activation: sigmoid
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [(Input(6), -0.1170296)], bias: 0.39411813, activation: sigmoid
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [(Input(7), -0.8857193), (Input(11), -0.97913766), (Input(2), 0.2923255), (Input(1), 0.26824117), (Input(2), -0.8934064), (Input(10), -0.19709682), (Input(9), -0.92098737), (Input(6), -0.9772694), (Input(7), 0.08727813), (Input(3), -0.61651254), (Input(9), 0.42674088), (Input(7), -0.801528), (Input(1), 0.6078919)], bias: 0.7993354, activation: sigmoid
 }, poisoned: false, .. }], mutation_rate: 0.01, mutation_passes: 3 } }
DNA { network: NeuralNetworkTopology { input_layer: [RwLock { data: NeuronTopology { inputs: [], bias: 0.19538373, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9611819, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.5509694, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.31042653, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9654784, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.81183213, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.86611843, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.9298546, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.8283311, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.8759112, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.4996699, activation: linear_activation
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [], bias: 0.5423544, activation: linear_activation
 }, poisoned: false, .. }], hidden_layers: [], output_layer: [RwLock { data: NeuronTopology { inputs: [], bias: 0.010660529, activation: sigmoid
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [(Input(6), -0.1170296)], bias: 0.39411813, activation: sigmoid
 }, poisoned: false, .. }, RwLock { data: NeuronTopology { inputs: [(Input(7), -0.8857193), (Input(11), -0.97913766), (Input(2), 0.2923255), (Input(1), 0.26824117), (Input(2), -0.8934064), (Input(10), -0.19709682), (Input(9), -0.92098737), (Input(6), -0.9772694), (Input(7), 0.08727813), (Input(3), -0.61651254), (Input(9), 0.42674088), (Input(7), -0.801528), (Input(1), 0.6078919)], bias: 0.7993354, activation: sigmoid
 }, poisoned: false, .. }], mutation_rate: 0.01, mutation_passes: 3 } }

I made a new commit if you want to check it: 8d75367

@HyperCodec
Owner Author

Well, I stayed up longer than expected 😅 Here is the DNA of every genome of a sim that froze...

Something I noticed here is that in each genome, one of the output neurons has a lot of inputs from input-layer neurons. I doubt this is just a result of evolution, given the huge ratio between that neuron and the other neurons. Probably another issue to fix.

Anyways, I created #61 for the duplicate neuron references that are in the inputs to that output neuron.

@HyperCodec
Owner Author

HyperCodec commented Jun 1, 2024

Merged #62, which addresses the main suspect for this issue.

@Bowarc Can you try to run with neat = { git = "https://github.com/hypercodec/neat", branch = "dev", features = ["whateveryouhadbefore"] } and see if it still hangs?

@Bowarc

Bowarc commented Jun 1, 2024

I've now tested over 3k generations and it seems to be stable, thank you for the fix (I had ["crossover", "rayon", "serde"] as features).

@HyperCodec
Owner Author

Np

@HyperCodec
Owner Author

You can use the dev branch for now, but it's not a good branch to stay on because of large API changes; definitely switch back to stable after the next release.

@Bowarc

Bowarc commented Jun 1, 2024

Alright, thanks!

@Bowarc

Bowarc commented Jun 1, 2024

image
Oh.
I swapped to DivisionReproduction (I was on CrossoverReproduction before) and on the first try, after about 125 generations, it deadlocked.

Here is the simulation data
sim.backup.txt

I tried more tests, even went back to CrossoverReproduction w/ crossover_pruning_nextgen, but it appears to be deadlocking 100% of the time again.
After more tests I found that if I have too low a number of genomes per generation (<100), it deadlocks in about 10/100 gens.
It seems fine with 1000 genomes/gen.

DivisionReproduction hangs after a bit with 1000 genomes, here is the sim data:
sim.backup.txt

@HyperCodec HyperCodec reopened this Jun 1, 2024
@HyperCodec
Owner Author

I swapped to DivisionReproduction (I was on CrossoverReproduction before) and on the first try, after about 125 generations, it deadlocked. ...

Interesting that it made it through ~3k generations without deadlocking on CrossoverReproduction the first time but not the second time. Perhaps you just got really lucky on that run. At least this eliminates the premise that the double neuron input thing is the sole cause of the deadlock (although it probably was also causing a deadlock in and of itself; maybe there are just multiple issues here).

@HyperCodec
Owner Author

After looking through your backup files, I noticed that there are still duplicate inputs. I am not sure how they are being created this time.

@Bowarc

Bowarc commented Jun 2, 2024

While testing performance & learning curves, I found out that a high mutation rate (>=0.1) deadlocks in less than 50 gens 100% of the time. Now that I think of it, that might be the difference between me saying that it looks good and me saying that it doesn't work again.

Example:

pub const NB_GAMES: usize = 3;
pub const GAME_TIME_S: usize = 20; // Number of seconds we let the AI play the game before registering its score
pub const GAME_DT: f64 = 0.05; // 0.0166
pub const NB_GENERATIONS: usize = 100;
pub const NB_GENOME_PER_GEN: usize = 2000;

neat::NeuralNetworkTopology::new(0.2, 3, rng)

Deadlocks in 15 generations

sim15.backup.txt

@HyperCodec
Owner Author

While testing performance & learning curves, I found out that a high mutation rate (>=0.1) deadlocks in less than 50 gens 100% of the time. ...

So yeah, the deadlock is probably caused by one of the mutations.

@HyperCodec
Owner Author

I wonder if the deadlock might be happening during the mutation phase, which would leave the network in a state that can't be accurately debugged, since it hasn't finished mutating before it deadlocks.

@HyperCodec
Owner Author

HyperCodec commented Jun 7, 2024

Might not necessarily mean anything, but I just ran some stress tests and such on Windows on the dev branch (rayon and crossover) and it didn't deadlock once.

Either I'm just really lucky or this has something to do with platform-specific behavior.

@Bowarc

Bowarc commented Jun 7, 2024

Have you tried a high mutation rate?

@HyperCodec
Owner Author

Yeah I just got lucky, it happens on any platform.

I did more testing and found that the deadlock happens during the running phase, meaning it's still probably some type of recursive RwLock.

@HyperCodec
Owner Author

Still can't find this deadlock even after weeks, it's being really evasive.

It's almost certainly a recursive RwLock or a duped input, but I have code to prevent both of those from happening.

I thought it might be one of those while loops that reroll until a valid state is reached looping infinitely because there is no valid state, but the deadlock doesn't happen during mutation, so it can't be that (although I probably do want to patch that; it's extremely rare and unlikely to ever happen, but still a possibility).

I'm really just out of ideas for what could possibly cause this issue.

@HyperCodec
Owner Author

While I think this is definitely a high-priority issue that urgently needs to be fixed, I'll take a break from it so it doesn't keep taking time away from new features and such.

@HyperCodec
Owner Author

HyperCodec commented Jul 12, 2024

I think I found the cause of the issue: if all of the threads are blocked on a lock while waiting on other tasks, rayon has no free thread left to pick up and run those dependency tasks.
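
Roughly the shape I'm imagining (a contrived sketch, not code from this crate):

use rayon::prelude::*;
use std::sync::{Arc, RwLock};

// Contrived illustration of the theory: a worker takes a lock and then
// blocks inside a nested parallel call. While blocked in rayon, that worker
// work-steals other queued tasks; if a stolen task tries to take the same
// lock, the thread deadlocks against itself. Workers that OS-block on the
// lock can't steal at all, so the whole pool can starve.
fn demo(cache: Arc<RwLock<Vec<f32>>>) {
    (0..1024).into_par_iter().for_each(|i| {
        let _guard = cache.write().unwrap(); // may OS-block this worker
        // Nested parallel work queued while the lock is still held:
        let _sum: f32 = (0..8).into_par_iter().map(|j| (i * j) as f32).sum();
    });
}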

@HyperCodec
Owner Author

Created rayon-rs/rayon#1181; waiting for confirmation on a solution. If rayon takes too long to introduce a fix, I can probably make a temporary fix here.

@dsgallups

I'm not sure if this helps; I've been working on a crate based on yours and noticed that the network topology is able to create cycles in the data structure of the neural network. Please let me know if I'm missing something! (drawing a picture real quick)

@dsgallups

dsgallups commented Sep 15, 2024

Visual example attached

image

While NeuralNetworkTopology::mutate checks for duplicate inputs, it does not appear to resolve graph cycles. I think back edge detection would work here.

Edit: I've implemented this here
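
For reference, the check I mean looks roughly like this (a sketch using a hypothetical predecessor-list representation, not this crate's types):

use std::collections::HashSet;

// `preds[n]` lists the neurons feeding neuron `n`. Adding the edge
// `src -> dst` closes a cycle iff a path `dst -> ... -> src` already
// exists, i.e. iff we can reach `dst` by walking backwards from `src`
// through predecessor edges.
fn creates_cycle(preds: &[Vec<usize>], src: usize, dst: usize) -> bool {
    let mut seen = HashSet::new();
    let mut stack = vec![src];
    while let Some(n) = stack.pop() {
        if n == dst {
            return true; // dst is upstream of src: the new edge is a back edge
        }
        if seen.insert(n) {
            stack.extend(preds[n].iter().copied());
        }
    }
    false
}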

@HyperCodec
Owner Author

HyperCodec commented Sep 15, 2024

While NeuralNetworkTopology::mutate checks for duplicate inputs, it does not appear to resolve graph cycles. I think back edge detection would work here. ...

I had a DFS algorithm that was attempting to resolve these loops. I'm pretty sure I had it working, but it's kind of hard to tell with how random things are in genetic simulations.

https://github.com/HyperCodec/neat/blob/main/src/topology/mod.rs#L119

I've also narrowed this down to pretty much only ever happening with the rayon feature enabled, so I'm thinking it's probably some kind of lock collision. The CPU usage goes down a ton, which also suggests that the threads are paused.

@HyperCodec
Owner Author

Now that I think about it, I should really use a seeded RNG when testing these things to get rid of some of the randomness.
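
Something like this (sketch only; it assumes the constructor accepts any rand::Rng, as the call quoted earlier suggests):

use rand::{rngs::StdRng, SeedableRng};

// A fixed seed makes a hanging run reproducible instead of one-off.
let rng = &mut StdRng::seed_from_u64(42);
let topology = neat::NeuralNetworkTopology::new(0.2, 3, rng);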

@dsgallups

dsgallups commented Sep 24, 2024

Found it in my fork. On deeply nested structures, par_iter(...).sum will be blocked on all threads, and therefore no values can return when the summation of inputs occurs:

neat/src/runnable.rs, lines 106 to 111 in 228f7af:

.par_iter()
.map(|&(n2, w)| {
let processed = self.process_neuron(n2);
processed * w
})
.sum();

The sum operation can never complete, even if all of the iterator's components have returned. This is because, at the instant the final child completes, the thread is returned to the pool. Before the sum operation is handed to this last open thread, that thread is allocated to another par_iter that will block. Then all of the other threads in rayon's thread pool are blocked (some of them waiting for this node's function to return) and cannot be given out to complete the sum op. I had posted a proof of concept but have since made my repo private.

@HyperCodec
Owner Author

HyperCodec commented Sep 24, 2024

Found it in my fork. On deeply nested structures, par_iter(...).sum will be blocked on all threads, and therefore no values can return when the summation of inputs occurs. ...

Are you sure this is because of the lazily stacked sum and not the call to map before it, which uses RwLocks and such?

If sum is causing this, then would converting back to a single-threaded iterator after mapping solve the issue?
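
i.e. something like this (sketch against the snippet above, with `inputs` standing in for the neuron's input list; note that collect() would still wait inside the pool, so this only moves the final reduction out of rayon):

// Hypothetical rewrite: parallel map, then a plain sequential sum once
// all of the parts have been collected.
let parts: Vec<f32> = inputs
    .par_iter()
    .map(|&(n2, w)| self.process_neuron(n2) * w)
    .collect();
let value: f32 = parts.iter().sum();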

@dsgallups

dsgallups commented Sep 24, 2024

Good point! Let me make a real fork real quick with rayon, run it with a high SplitConnection mutation rate, and compare.

@dsgallups

dsgallups commented Sep 24, 2024

Ah, you were right. for_each also does not complete, even after the result is returned. I have essentially been using trace to determine this. Here are the details, trying an RwLock instead of using .sum:

        // Shared accumulator; RwLock provides interior mutability, so no `mut` binding is needed.
        let sum = RwLock::new(0.);

        self.inputs()
            .unwrap()
            .par_iter()
            .enumerate()
            .for_each(|(idx, input)| {
                info!(
                    "{} REQUEST INPUT ({}/{})",
                    self.id_short(),
                    idx,
                    num_inputs - 1
                );
                let res = input.get_input_value(self.id_short(), idx);
                info!(
                    "{} RECEIVED INPUT ({}/{}) ({})",
                    self.id_short(),
                    idx,
                    num_inputs - 1,
                    res
                );
                // Take the write guard under a distinct name to avoid shadowing `sum`.
                let mut guard = sum.write().unwrap();
                *guard += res;
            });

        info!("{} RETURNING RESULT FROM INPUTS", self.id_short());

        let sum = sum.into_inner().unwrap();
        self.activated_value = Some(sum);

The following log identifies a neuron that has received back all of its inputs. However, the function never returns. Log lines from other threads follow this point, but the final info trace for this particular node is never reached.

2024-09-24T16:05:14.445053Z  INFO candle_neat::simple_net::neuron: 398ba9 RECEIVED INPUT (0/1) (0)
2024-09-24T16:05:14.445084Z  INFO candle_neat::simple_net::neuron: 398ba9 RECEIVED INPUT (1/1) (0)

@dsgallups

dsgallups commented Sep 24, 2024

One interesting property to note is that, at least on my end, attaching by_uniform_blocks(1) to the parallel iterator stops this blocking behavior... at least, that's what I've found after running a super high split rate for 5-6 minutes. I'm pretty sure this just makes the iterator sequential, but yeah lol

@HyperCodec
Owner Author

neat hang diagram

This is a little diagram I made explaining my earlier theory. I'm not sure what can be done to prevent this without completely forking rayon (to make a custom lock type compatible with it) or making some hacky spinlock solution with tons of rayon::yield_now() calls.
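
The hacky spinlock version would look something like this (sketch only; rayon::yield_now() exists as of rayon 1.7, and this trades the deadlock for busy-waiting):

use std::sync::RwLock;

// Instead of OS-blocking in read() (which prevents work-stealing), spin on
// try_read() and let rayon run queued tasks between attempts, so the task
// that would release the lock can eventually be executed.
fn read_cooperatively<T, R>(lock: &RwLock<T>, f: impl Fn(&T) -> R) -> R {
    loop {
        if let Ok(guard) = lock.try_read() {
            return f(&*guard);
        }
        // Execute (or steal) one pending rayon task, if any, then retry.
        let _ = rayon::yield_now();
    }
}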

@HyperCodec
Owner Author

HyperCodec commented Sep 25, 2024

The reason this doesn't always deadlock is that rayon is work-stealing: if any thread finishes before the others (as in, the dependency task is the first one added to its queue, or all of the base tasks are on some other thread), it can steal tasks from the waiting threads, preventing a deadlock.

This deadlock only happens when every thread has a waiting task at the start of its queue, which isn't super common (and gets much rarer with each CPU core added).

@HyperCodec
Owner Author

@dsgallups, would you be able to look into this a bit? There is an issue on the rayon GitHub about it (rayon-rs/rayon#592), but it's been open since 2018 and it doesn't look like it's going to be fixed any time soon.

@HyperCodec
Owner Author

From that issue, it looks like there is a workaround using a custom ThreadPool for the locking work, but I'm not sure how well that would work with a recursive algorithm like this.

@dsgallups

dsgallups commented Sep 25, 2024

If this were async, I'd know how to handle it with tokio, since threads can rejoin the thread pool across await boundaries... it's an interesting challenge to determine when an iterator is ready to complete while all of the threads are being blocked. I'll take a look into it.

Edit: Going to see if rayon-rs/rayon#1175 is a quick win.

@dsgallups

Unfortunately, I've decided not to pursue debugging rayon. I'm opting to do network expansion, transforming the network into a set of tensors as defined here and running on candle-rs. Hope someone else will be able to figure this one out! Just wanted to give an update. Thanks for your efforts!
