ParallelReverseAutoDiff 1.0.2
dotnet add package ParallelReverseAutoDiff --version 1.0.2
NuGet\Install-Package ParallelReverseAutoDiff -Version 1.0.2
<PackageReference Include="ParallelReverseAutoDiff" Version="1.0.2" />
paket add ParallelReverseAutoDiff --version 1.0.2
#r "nuget: ParallelReverseAutoDiff, 1.0.2"
// Install ParallelReverseAutoDiff as a Cake Addin
#addin nuget:?package=ParallelReverseAutoDiff&version=1.0.2
// Install ParallelReverseAutoDiff as a Cake Tool
#tool nuget:?package=ParallelReverseAutoDiff&version=1.0.2
ParallelReverseAutoDiff
Parallel Reverse Mode Automatic Differentiation in C#
ParallelReverseAutoDiff is a thread-safe C# library for reverse mode automatic differentiation, optimized for parallel computation. It uses semaphores and locks to coordinate threads so that gradient accumulation stays correct. Each operation is implemented as a node with a forward and a backward function, which keeps derivative calculation efficient. The library also relies on the visitor pattern: a specialized 'Neural Network Visitor' traverses neural network nodes across different threads and is responsible for accumulating gradients on nodes shared by multiple threads. This design parallelizes the computation while maintaining consistency and avoiding race conditions, which makes the library well suited to machine learning applications and neural network training.
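The two ideas in this description, a node with a forward and a backward function and lock-protected gradient accumulation, can be sketched roughly as follows. The type names `SigmoidNode`, `SharedGradient`, and `AccumulationDemo` are hypothetical and are not the library's API; this is only a minimal illustration under that assumption.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical types for illustration only; they are not the library's API.
public sealed class SigmoidNode
{
    private double[] output = Array.Empty<double>();

    // Forward: y = 1 / (1 + e^-x), cached for the backward pass.
    public double[] Forward(double[] input)
    {
        output = new double[input.Length];
        for (int i = 0; i < input.Length; i++)
        {
            output[i] = 1.0 / (1.0 + Math.Exp(-input[i]));
        }
        return output;
    }

    // Backward: dL/dx = dL/dy * y * (1 - y), using the cached forward output.
    public double[] Backward(double[] upstream)
    {
        var grad = new double[upstream.Length];
        for (int i = 0; i < upstream.Length; i++)
        {
            grad[i] = upstream[i] * output[i] * (1.0 - output[i]);
        }
        return grad;
    }
}

public sealed class SharedGradient
{
    private readonly object gate = new object();
    private readonly double[] values;

    public SharedGradient(int length)
    {
        values = new double[length];
    }

    // Adds a partial gradient under a lock so concurrent writers cannot interleave.
    public void Accumulate(double[] partial)
    {
        lock (gate)
        {
            for (int i = 0; i < values.Length; i++)
            {
                values[i] += partial[i];
            }
        }
    }

    // Returns a copy so callers never observe a half-updated array.
    public double[] Snapshot()
    {
        lock (gate)
        {
            return (double[])values.Clone();
        }
    }
}

public static class AccumulationDemo
{
    // Runs `workers` parallel iterations that each add `partial` once.
    public static double[] Run(int workers, double[] partial)
    {
        var gradient = new SharedGradient(partial.Length);
        Parallel.For(0, workers, _ => gradient.Accumulate(partial));
        return gradient.Snapshot();
    }
}
```

Calling `AccumulationDemo.Run(8, new[] { 1.0, 2.0 })` adds the same partial gradient from eight parallel iterations. Without the lock, the `+=` updates could race and lose contributions.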
Supported Operations
AmplifiedSigmoidOperation - Used for gradient amplification
ApplyDropoutOperation
HadamardProductOperation
LayerNormalizationOperation
LeakyReLUOperation
MatrixAddOperation
MatrixAddThreeOperation
MatrixMultiplyOperation
MatrixMultiplyScalarOperation
MatrixTransposeOperation
SigmoidOperation
SoftmaxOperation
StretchedSigmoidOperation
TanhOperation
Usage
Create architecture JSON file
Here is an example:
{
"timeSteps": [
{
"startOperations": [
{
"id": "projectedInput",
"description": "Multiply the input with the weight matrix",
"type": "MatrixMultiplyOperation",
"inputs": [ "We", "inputSequence[t]" ],
"gradientResultTo": [ "dWe", null ]
},
{
"id": "embeddedInput",
"description": "Add the bias",
"type": "MatrixAddOperation",
"inputs": [ "projectedInput", "be" ],
"gradientResultTo": [ null, "dbe" ]
}
],
"layers": [
{
"operations": [
{
"id": "wf_currentInput",
"type": "MatrixMultiplyOperation",
"inputs": [ "Wf[layerIndex]", "currentInput" ],
"gradientResultTo": [ "dWf[layerIndex]", null ]
},
{
"id": "uf_previousHiddenState",
"type": "MatrixMultiplyOperation",
"inputs": [ "Uf[layerIndex]", "previousHiddenState" ],
"gradientResultTo": [ "dUf[layerIndex]", null ]
},
{
"id": "f_add",
"type": "MatrixAddThreeOperation",
"inputs": [ "wf_currentInput", "uf_previousHiddenState", "bf[layerIndex]" ],
"gradientResultTo": [ null, null, "dbf[layerIndex]" ]
},
{
"id": "intermediate_f_1",
"description": "Compute the forget gate",
"type": "MatrixTransposeOperation",
"inputs": [ "f_add" ]
},
{
"id": "intermediate_f_2",
"description": "Compute the forget gate",
"type": "LayerNormalizationOperation",
"inputs": [ "intermediate_f_1" ]
},
{
"id": "intermediate_f_3",
"description": "Compute the forget gate",
"type": "MatrixTransposeOperation",
"inputs": [ "intermediate_f_2" ]
},
{
"id": "f",
"description": "Compute the forget gate",
"type": "AmplifiedSigmoidOperation",
"inputs": [ "intermediate_f_3" ],
"setResultTo": "f[t][layerIndex]"
},
{
"id": "wi_currentInput",
"type": "MatrixMultiplyOperation",
"inputs": [ "Wi[layerIndex]", "currentInput" ],
"gradientResultTo": [ "dWi[layerIndex]", null ]
},
{
"id": "ui_previousHiddenState",
"type": "MatrixMultiplyOperation",
"inputs": [ "Ui[layerIndex]", "previousHiddenState" ],
"gradientResultTo": [ "dUi[layerIndex]", null ]
},
{
"id": "i_add",
"type": "MatrixAddThreeOperation",
"inputs": [ "wi_currentInput", "ui_previousHiddenState", "bi[layerIndex]" ],
"gradientResultTo": [ null, null, "dbi[layerIndex]" ]
},
{
"id": "intermediate_i_1",
"description": "Compute the input gate",
"type": "MatrixTransposeOperation",
"inputs": [ "i_add" ]
},
{
"id": "intermediate_i_2",
"description": "Compute the input gate",
"type": "LayerNormalizationOperation",
"inputs": [ "intermediate_i_1" ]
},
{
"id": "intermediate_i_3",
"description": "Compute the input gate",
"type": "MatrixTransposeOperation",
"inputs": [ "intermediate_i_2" ]
},
{
"id": "i",
"description": "Compute the input gate",
"type": "AmplifiedSigmoidOperation",
"inputs": [ "intermediate_i_3" ],
"setResultTo": "i[t][layerIndex]"
},
{
"id": "wc_currentInput",
"type": "MatrixMultiplyOperation",
"inputs": [ "Wc[layerIndex]", "currentInput" ],
"gradientResultTo": [ "dWc[layerIndex]", null ]
},
{
"id": "uc_previousHiddenState",
"type": "MatrixMultiplyOperation",
"inputs": [ "Uc[layerIndex]", "previousHiddenState" ],
"gradientResultTo": [ "dUc[layerIndex]", null ]
},
{
"id": "cHat_add",
"type": "MatrixAddThreeOperation",
"inputs": [ "wc_currentInput", "uc_previousHiddenState", "bc[layerIndex]" ],
"gradientResultTo": [ null, null, "dbc[layerIndex]" ]
},
{
"id": "intermediate_cHat_1",
"description": "Compute the candidate memory cell state",
"type": "MatrixTransposeOperation",
"inputs": [ "cHat_add" ]
},
{
"id": "intermediate_cHat_2",
"description": "Compute the candidate memory cell state",
"type": "LayerNormalizationOperation",
"inputs": [ "intermediate_cHat_1" ]
},
{
"id": "intermediate_cHat_3",
"description": "Compute the candidate memory cell state",
"type": "MatrixTransposeOperation",
"inputs": [ "intermediate_cHat_2" ]
},
{
"id": "cHat",
"description": "Compute the candidate memory cell state",
"type": "TanhOperation",
"inputs": [ "intermediate_cHat_3" ],
"setResultTo": "cHat[t][layerIndex]"
},
{
"id": "f_previousMemoryCellState",
"type": "HadamardProductOperation",
"inputs": [ "f[t][layerIndex]", "previousMemoryCellState" ]
},
{
"id": "i_cHat",
"type": "HadamardProductOperation",
"inputs": [ "i[t][layerIndex]", "cHat[t][layerIndex]" ]
},
{
"id": "newC",
"description": "Compute the memory cell state",
"type": "MatrixAddOperation",
"inputs": [ "f_previousMemoryCellState", "i_cHat" ]
},
{
"id": "newCTransposed",
"type": "MatrixTransposeOperation",
"inputs": [ "newC" ]
},
{
"id": "newCNormalized",
"type": "LayerNormalizationOperation",
"inputs": [ "newCTransposed" ]
},
{
"id": "c",
"type": "MatrixTransposeOperation",
"inputs": [ "newCNormalized" ],
"setResultTo": "c[t][layerIndex]"
},
{
"id": "wo_currentInput",
"type": "MatrixMultiplyOperation",
"inputs": [ "Wo[layerIndex]", "currentInput" ],
"gradientResultTo": [ "dWo[layerIndex]", null ]
},
{
"id": "uo_previousHiddenState",
"type": "MatrixMultiplyOperation",
"inputs": [ "Uo[layerIndex]", "previousHiddenState" ],
"gradientResultTo": [ "dUo[layerIndex]", null ]
},
{
"id": "o_add",
"type": "MatrixAddThreeOperation",
"inputs": [ "wo_currentInput", "uo_previousHiddenState", "bo[layerIndex]" ],
"gradientResultTo": [ null, null, "dbo[layerIndex]" ]
},
{
"id": "o",
"description": "Compute the output gate",
"type": "LeakyReLUOperation",
"inputs": [ "o_add" ],
"setResultTo": "o[t][layerIndex]"
},
{
"id": "c_tanh",
"type": "TanhOperation",
"inputs": [ "c" ]
},
{
"id": "newH",
"type": "HadamardProductOperation",
"inputs": [ "o[t][layerIndex]", "c_tanh" ]
},
{
"id": "keys",
"type": "MatrixMultiplyOperation",
"inputs": [ "Wk[layerIndex]", "embeddedInput" ],
"gradientResultTo": [ "dWk[layerIndex]", null ]
},
{
"id": "queries",
"type": "MatrixMultiplyOperation",
"inputs": [ "Wq[layerIndex]", "previousHiddenState" ],
"gradientResultTo": [ "dWq[layerIndex]", null ]
},
{
"id": "values",
"type": "MatrixMultiplyOperation",
"inputs": [ "Wv[layerIndex]", "embeddedInput" ],
"gradientResultTo": [ "dWv[layerIndex]", null ]
},
{
"id": "queriesTranspose",
"type": "MatrixTransposeOperation",
"inputs": [ "queries" ]
},
{
"id": "dotProduct",
"description": "Compute the dot product of the queries and keys",
"type": "MatrixMultiplyOperation",
"inputs": [ "keys", "queriesTranspose" ]
},
{
"id": "scaledDotProduct",
"description": "Scale the dot product",
"type": "MatrixMultiplyScalarOperation",
"inputs": [ "dotProduct", "scaledDotProductScalar" ]
},
{
"id": "scaledDotProductTranspose",
"type": "MatrixTransposeOperation",
"inputs": [ "scaledDotProduct" ]
},
{
"id": "attentionWeights",
"type": "SoftmaxOperation",
"inputs": [ "scaledDotProductTranspose" ]
},
{
"id": "attentionOutput",
"type": "MatrixMultiplyOperation",
"inputs": [ "attentionWeights", "values" ]
},
{
"id": "newHWithAttentionOutput",
"type": "MatrixAddOperation",
"inputs": [ "newH", "attentionOutput" ]
},
{
"id": "newHWithAttentionOutputTranspose",
"type": "MatrixTransposeOperation",
"inputs": [ "newHWithAttentionOutput" ]
},
{
"id": "normalizedNewH",
"type": "LayerNormalizationOperation",
"inputs": [ "newHWithAttentionOutputTranspose" ]
},
{
"id": "h",
"type": "MatrixTransposeOperation",
"inputs": [ "normalizedNewH" ],
"setResultTo": "h[t][layerIndex]"
}
]
}
],
"endOperations": [
{
"id": "v_h",
"type": "MatrixMultiplyOperation",
"inputs": [ "V", "hFromCurrentTimeStepAndLastLayer" ],
"gradientResultTo": [ "dV", null ]
},
{
"id": "v_h_b",
"type": "MatrixAddOperation",
"inputs": [ "v_h", "b" ],
"gradientResultTo": [ null, "db" ]
},
{
"id": "output_t",
"type": "AmplifiedSigmoidOperation",
"inputs": [ "v_h_b" ],
"setResultTo": "output[t]"
}
]
}
]
}
Instantiate the architecture
Use a JSON serialization library such as Newtonsoft.Json to deserialize the JSON file into a JSONArchitecture object.
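As a rough sketch of this step, the classes below mirror the fields that appear in the example JSON. The class names (`ArchitectureInfo`, `TimeStepInfo`, `LayerInfo`, `OperationInfo`, `ArchitectureLoader`) are assumptions made for illustration and may not match the library's own architecture type. The sketch uses `System.Text.Json` rather than Newtonsoft.Json so that it needs no extra package on recent .NET versions.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical classes shaped after the example JSON; not the library's own types.
public sealed class OperationInfo
{
    public string Id { get; set; }
    public string Description { get; set; }
    public string Type { get; set; }
    public string[] Inputs { get; set; } = Array.Empty<string>();
    public string[] GradientResultTo { get; set; } // entries may be null
    public string SetResultTo { get; set; }
}

public sealed class LayerInfo
{
    public List<OperationInfo> Operations { get; set; } = new List<OperationInfo>();
}

public sealed class TimeStepInfo
{
    public List<OperationInfo> StartOperations { get; set; } = new List<OperationInfo>();
    public List<LayerInfo> Layers { get; set; } = new List<LayerInfo>();
    public List<OperationInfo> EndOperations { get; set; } = new List<OperationInfo>();
}

public sealed class ArchitectureInfo
{
    public List<TimeStepInfo> TimeSteps { get; set; } = new List<TimeStepInfo>();
}

public static class ArchitectureLoader
{
    // Parses architecture JSON; camelCase keys are matched case-insensitively.
    public static ArchitectureInfo Parse(string json)
    {
        var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
        return JsonSerializer.Deserialize<ArchitectureInfo>(json, options);
    }
}
```

With Newtonsoft.Json, `JsonConvert.DeserializeObject<ArchitectureInfo>(json)` would play the same role. Reading the file first with `File.ReadAllText` and passing the text to `Parse` keeps the parsing logic easy to test.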
Instantiate and populate the operations
Instantiate each operation based on its type, then set each operation's Next property to the operation that follows it in the forward pass.
For each operation, add the operations that precede it in the computation graph to its BackwardAdjacentOperations property, which is a list of operations.
Add the objects that should receive each input's gradient to the GradientDestinations array. If an input has no gradient result, add null in its position.
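The wiring described above might look roughly like the sketch below. `OpNode` and `GraphWiring` are hypothetical stand-ins, not the library's classes, and the sketch assumes that operations arrive in forward order and that an input naming an earlier operation's id identifies a backward-adjacent operation.

```csharp
using System.Collections.Generic;

// Hypothetical node type for illustration; the library's operations differ.
public sealed class OpNode
{
    public string Id { get; }
    public string[] Inputs { get; }
    public string[] GradientResultTo { get; } // may be null, entries may be null
    public OpNode Next { get; set; }
    public List<OpNode> BackwardAdjacentOperations { get; } = new List<OpNode>();
    public double[][][] GradientDestinations { get; set; }

    public OpNode(string id, string[] inputs, string[] gradientResultTo)
    {
        Id = id;
        Inputs = inputs;
        GradientResultTo = gradientResultTo;
    }
}

public static class GraphWiring
{
    // Links operations given in forward order and resolves gradient targets by name.
    public static Dictionary<string, OpNode> Wire(
        IReadOnlyList<OpNode> ordered,
        IDictionary<string, double[][]> gradients)
    {
        var byId = new Dictionary<string, OpNode>();
        for (int i = 0; i < ordered.Count; i++)
        {
            var op = ordered[i];
            if (i + 1 < ordered.Count)
            {
                op.Next = ordered[i + 1]; // forward-pass successor
            }

            // Inputs produced by earlier operations become backward-adjacent operations.
            foreach (var input in op.Inputs)
            {
                if (byId.TryGetValue(input, out var producer))
                {
                    op.BackwardAdjacentOperations.Add(producer);
                }
            }

            // One destination slot per input; null means no gradient is stored.
            op.GradientDestinations = new double[op.Inputs.Length][][];
            for (int k = 0; k < op.Inputs.Length; k++)
            {
                string name = op.GradientResultTo != null && k < op.GradientResultTo.Length
                    ? op.GradientResultTo[k]
                    : null;
                op.GradientDestinations[k] = name != null ? gradients[name] : null;
            }

            byId[op.Id] = op;
        }
        return byId;
    }
}
```

For the two start operations in the example JSON, this would link `projectedInput` to `embeddedInput` through Next, register `projectedInput` as backward-adjacent to `embeddedInput`, and point the gradient slots at the `dWe` and `dbe` matrices.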
Then populate the backward dependency counts by running the following code. It only needs to run once.
for (int t = numTimeSteps - 1; t >= 0; t--) // if there are multiple timesteps
{
    backwardStartOperation = operationsMap[$"output_t_{t}"]; // the backward start operation
    OperationGraphVisitor opVisitor = new OperationGraphVisitor(Guid.NewGuid().ToString(), backwardStartOperation, t);
    await opVisitor.TraverseAsync(); // sets the backward dependency counts
    await opVisitor.ResetVisitedCountsAsync(backwardStartOperation);
}
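The source does not spell out how the counts are computed. One plausible reading, sketched here under that assumption, is that a node's backward dependency count is the number of operations that will send it a gradient during the backward pass. `CountedNode` and `DependencyCounter` are hypothetical names, not the library's `OperationGraphVisitor`.

```csharp
using System.Collections.Generic;

// Hypothetical node type; illustrates one way to define a backward dependency count.
public sealed class CountedNode
{
    public string Id { get; }
    public List<CountedNode> BackwardAdjacent { get; } = new List<CountedNode>();
    public int BackwardDependencyCount { get; set; }

    public CountedNode(string id)
    {
        Id = id;
    }
}

public static class DependencyCounter
{
    // Walks backward from `start` and counts, for every reachable node,
    // how many nodes list it as backward-adjacent (each edge counted once).
    public static void Populate(CountedNode start)
    {
        var visited = new HashSet<CountedNode> { start };
        var stack = new Stack<CountedNode>();
        stack.Push(start);

        while (stack.Count > 0)
        {
            var node = stack.Pop();
            foreach (var previous in node.BackwardAdjacent)
            {
                previous.BackwardDependencyCount++;
                if (visited.Add(previous))
                {
                    stack.Push(previous); // expand each node only once
                }
            }
        }
    }
}
```

In a diamond-shaped graph where an output feeds two branches that both lead back to the same node, that shared node ends up with a count of 2, so its backward function should wait for both branches.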
Run the forward pass
var op = startOperation; // the start operation
IOperation currOp = null;
do
{
    var parameters = LookupParameters(op); // lookup the parameters
    op.OperationType.GetMethod("Forward").Invoke(op, parameters); // call the forward function
    if (op.ResultToName != null)
    {
        op.ResultTo(NameToValueFunc(op.ResultToName)); // send the result to the appropriate object
    }
    operationsMap[op.SpecificId] = op;
    currOp = op;
    if (op.HasNext)
    {
        op = op.Next;
    }
} while (currOp.Next != null);
Run the backward pass utilizing inherent parallelization
for (int t = numTimeSteps - 1; t >= 0; t--)
{
    backwardStartOperation = operationsMap[$"output_t_{t}"];
    if (gradientOfLossWrtOutput[t][0] != 0.0d)
    {
        backwardStartOperation.BackwardInput = new double[][] { gradientOfLossWrtOutput[t] };
        OperationNeuralNetworkVisitor opVisitor = new OperationNeuralNetworkVisitor(Guid.NewGuid().ToString(), backwardStartOperation, t);
        await opVisitor.TraverseAsync();
        opVisitor.Reset();
        traverseCount++;
    }
}
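How a visitor could use dependency counts to parallelize the backward pass can be sketched as follows. This is an assumption about the general technique, not the implementation of `OperationNeuralNetworkVisitor`: each node runs only after every operation that feeds it a gradient has finished, and independent branches run as separate tasks. `BackwardNode` and `ParallelBackward` are hypothetical names.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical node type for illustration; not the library's operation class.
public sealed class BackwardNode
{
    public string Id { get; }
    public List<BackwardNode> BackwardAdjacent { get; } = new List<BackwardNode>();
    public int Remaining; // public field so Interlocked can take it by ref

    public BackwardNode(string id)
    {
        Id = id;
    }
}

public static class ParallelBackward
{
    // Traverses backward from `start` and returns the order in which nodes ran.
    public static List<string> Run(BackwardNode start)
    {
        var order = new ConcurrentQueue<string>();
        VisitAsync(start, order).GetAwaiter().GetResult();
        return order.ToList();
    }

    private static async Task VisitAsync(BackwardNode node, ConcurrentQueue<string> order)
    {
        order.Enqueue(node.Id); // stand-in for calling the node's backward function

        var branches = new List<Task>();
        foreach (var previous in node.BackwardAdjacent)
        {
            // Only the task that delivers the last pending gradient continues into
            // `previous`, so a shared node is processed exactly once.
            if (Interlocked.Decrement(ref previous.Remaining) == 0)
            {
                branches.Add(Task.Run(() => VisitAsync(previous, order)));
            }
        }
        await Task.WhenAll(branches);
    }
}
```

Before calling `Run`, each node's `Remaining` field must be set to its backward dependency count. In a real backward pass the `Enqueue` call would be replaced by the gradient computation, with results accumulated under a lock on any node shared between branches.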
Product | Compatible and computed target framework versions |
---|---|
.NET | net5.0 was computed. net5.0-windows was computed. net6.0 was computed. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. |
.NET Core | netcoreapp3.0 was computed. netcoreapp3.1 was computed. |
.NET Standard | netstandard2.1 is compatible. |
MonoAndroid | monoandroid was computed. |
MonoMac | monomac was computed. |
MonoTouch | monotouch was computed. |
Tizen | tizen60 was computed. |
Xamarin.iOS | xamarinios was computed. |
Xamarin.Mac | xamarinmac was computed. |
Xamarin.TVOS | xamarintvos was computed. |
Xamarin.WatchOS | xamarinwatchos was computed. |
- .NETStandard 2.1
  - StyleCop.Analyzers (>= 1.1.118)
Version | Downloads | Last updated | |
---|---|---|---|
1.2.17 | 92 | 10/30/2024 | |
1.2.16 | 125 | 10/27/2024 | |
1.2.15 | 88 | 10/22/2024 | |
1.2.14 | 127 | 10/13/2024 | |
1.2.13 | 96 | 10/11/2024 | |
1.2.12 | 116 | 10/6/2024 | |
1.2.11 | 123 | 9/22/2024 | |
1.2.10 | 102 | 9/1/2024 | |
1.2.9 | 148 | 8/31/2024 | |
1.2.8 | 104 | 8/29/2024 | |
1.2.7 | 120 | 8/28/2024 | |
1.2.6 | 122 | 7/4/2024 | |
1.2.5 | 123 | 7/4/2024 | |
1.2.4 | 145 | 7/2/2024 | |
1.2.3 | 123 | 6/30/2024 | |
1.2.2 | 122 | 6/27/2024 | |
1.2.1 | 141 | 4/13/2024 | |
1.2.0 | 119 | 4/1/2024 | |
1.1.65 | 148 | 1/20/2024 | |
1.1.64 | 128 | 1/10/2024 | |
1.1.63 | 127 | 1/9/2024 | |
1.1.62 | 144 | 1/8/2024 | |
1.1.61 | 131 | 1/7/2024 | |
1.1.60 | 125 | 1/7/2024 | |
1.1.59 | 115 | 1/7/2024 | |
1.1.58 | 125 | 1/6/2024 | |
1.1.57 | 129 | 1/6/2024 | |
1.1.56 | 125 | 1/6/2024 | |
1.1.55 | 115 | 1/6/2024 | |
1.1.54 | 130 | 1/5/2024 | |
1.1.53 | 135 | 1/4/2024 | |
1.1.52 | 127 | 1/4/2024 | |
1.1.51 | 124 | 1/4/2024 | |
1.1.50 | 127 | 1/3/2024 | |
1.1.49 | 129 | 1/3/2024 | |
1.1.48 | 139 | 1/3/2024 | |
1.1.47 | 128 | 1/3/2024 | |
1.1.46 | 123 | 1/3/2024 | |
1.1.45 | 123 | 1/3/2024 | |
1.1.44 | 131 | 1/3/2024 | |
1.1.43 | 126 | 1/3/2024 | |
1.1.42 | 133 | 1/2/2024 | |
1.1.41 | 136 | 1/2/2024 | |
1.1.40 | 143 | 1/2/2024 | |
1.1.39 | 150 | 1/1/2024 | |
1.1.38 | 136 | 1/1/2024 | |
1.1.37 | 139 | 1/1/2024 | |
1.1.36 | 148 | 1/1/2024 | |
1.1.35 | 133 | 1/1/2024 | |
1.1.34 | 136 | 12/31/2023 | |
1.1.33 | 135 | 12/25/2023 | |
1.1.32 | 110 | 12/25/2023 | |
1.1.31 | 140 | 12/24/2023 | |
1.1.30 | 119 | 12/24/2023 | |
1.1.29 | 170 | 9/25/2023 | |
1.1.28 | 127 | 9/25/2023 | |
1.1.27 | 138 | 9/16/2023 | |
1.1.26 | 163 | 9/7/2023 | |
1.1.25 | 144 | 9/7/2023 | |
1.1.24 | 158 | 9/7/2023 | |
1.1.23 | 135 | 9/7/2023 | |
1.1.22 | 146 | 9/6/2023 | |
1.1.21 | 145 | 9/6/2023 | |
1.1.20 | 142 | 9/6/2023 | |
1.1.19 | 155 | 9/5/2023 | |
1.1.18 | 146 | 9/4/2023 | |
1.1.17 | 121 | 9/4/2023 | |
1.1.16 | 153 | 9/4/2023 | |
1.1.15 | 145 | 9/4/2023 | |
1.1.14 | 175 | 7/12/2023 | |
1.1.13 | 165 | 7/11/2023 | |
1.1.12 | 163 | 7/10/2023 | |
1.1.11 | 156 | 7/9/2023 | |
1.1.10 | 157 | 7/9/2023 | |
1.1.9 | 146 | 7/9/2023 | |
1.1.8 | 153 | 7/8/2023 | |
1.1.7 | 182 | 7/8/2023 | |
1.1.6 | 140 | 7/7/2023 | |
1.1.5 | 150 | 7/7/2023 | |
1.1.4 | 175 | 7/6/2023 | |
1.1.3 | 157 | 7/5/2023 | |
1.1.2 | 162 | 7/5/2023 | |
1.1.1 | 178 | 7/3/2023 | |
1.1.0 | 179 | 7/3/2023 | |
1.0.61 | 186 | 7/1/2023 | |
1.0.60 | 161 | 6/30/2023 | |
1.0.59 | 180 | 6/29/2023 | |
1.0.58 | 165 | 6/27/2023 | |
1.0.57 | 165 | 6/27/2023 | |
1.0.56 | 169 | 6/26/2023 | |
1.0.55 | 160 | 6/26/2023 | |
1.0.54 | 166 | 6/24/2023 | |
1.0.53 | 170 | 6/24/2023 | |
1.0.52 | 166 | 6/23/2023 | |
1.0.51 | 163 | 6/21/2023 | |
1.0.50 | 174 | 6/20/2023 | |
1.0.49 | 162 | 6/20/2023 | |
1.0.48 | 173 | 6/20/2023 | |
1.0.47 | 169 | 6/19/2023 | |
1.0.46 | 157 | 6/17/2023 | |
1.0.45 | 163 | 6/16/2023 | |
1.0.44 | 163 | 6/16/2023 | |
1.0.43 | 181 | 6/14/2023 | |
1.0.42 | 166 | 6/13/2023 | |
1.0.41 | 175 | 6/13/2023 | |
1.0.40 | 210 | 6/11/2023 | |
1.0.39 | 177 | 5/30/2023 | |
1.0.38 | 179 | 5/30/2023 | |
1.0.37 | 177 | 5/30/2023 | |
1.0.36 | 173 | 5/30/2023 | |
1.0.35 | 179 | 5/29/2023 | |
1.0.34 | 183 | 5/28/2023 | |
1.0.33 | 172 | 5/27/2023 | |
1.0.32 | 181 | 5/22/2023 | |
1.0.31 | 179 | 5/18/2023 | |
1.0.30 | 190 | 5/18/2023 | |
1.0.29 | 175 | 5/18/2023 | |
1.0.28 | 158 | 5/16/2023 | |
1.0.27 | 184 | 5/16/2023 | |
1.0.26 | 178 | 5/13/2023 | |
1.0.25 | 159 | 5/12/2023 | |
1.0.24 | 206 | 5/12/2023 | |
1.0.23 | 184 | 5/12/2023 | |
1.0.22 | 187 | 5/12/2023 | |
1.0.21 | 189 | 5/12/2023 | |
1.0.20 | 212 | 5/12/2023 | |
1.0.19 | 191 | 5/12/2023 | |
1.0.18 | 189 | 5/10/2023 | |
1.0.17 | 193 | 5/9/2023 | |
1.0.16 | 204 | 5/9/2023 | |
1.0.15 | 185 | 5/9/2023 | |
1.0.14 | 196 | 5/9/2023 | |
1.0.13 | 182 | 5/9/2023 | |
1.0.12 | 187 | 5/8/2023 | |
1.0.11 | 217 | 5/8/2023 | |
1.0.10 | 225 | 5/8/2023 | |
1.0.9 | 197 | 5/7/2023 | |
1.0.8 | 187 | 5/6/2023 | |
1.0.7 | 191 | 5/5/2023 | |
1.0.6 | 173 | 5/4/2023 | |
1.0.5 | 173 | 5/4/2023 | |
1.0.4 | 173 | 5/4/2023 | |
1.0.3 | 195 | 5/3/2023 | |
1.0.2 | 197 | 5/3/2023 | |
1.0.1 | 189 | 5/3/2023 | |
1.0.0 | 194 | 5/2/2023 |
Third release of ParallelReverseAutoDiff.